A significant number of hotel bookings are called off due to cancellations or no-shows. Typical reasons include changes of plans and scheduling conflicts. Cancelling is often made easier by the option to do so free of charge or at a low cost. This benefits hotel guests, but it is less desirable, and potentially revenue-diminishing, for hotels. The losses are particularly high for last-minute cancellations.
New technologies involving online booking channels have dramatically changed customers' booking possibilities and behavior. This adds a further dimension to the challenge of handling cancellations, which are no longer limited to traditional booking and guest characteristics.
The cancellation of bookings impacts a hotel on various fronts:
The increasing number of cancellations calls for a machine-learning-based solution that can help predict which bookings are likely to be canceled. INN Hotels Group operates a chain of hotels in Portugal. The group is facing a high number of booking cancellations and has reached out to your firm for data-driven solutions. As a data scientist, you have to analyze the data provided to find which factors have a high influence on booking cancellations, build a model that predicts in advance which bookings will be canceled, and help formulate profitable policies for cancellations and refunds.
The data contains the different attributes of customers' booking details. The detailed data dictionary is given below.
Data Dictionary
# Importing appropriate libraries
# For loading data and data processing
import numpy as np
import pandas as pd
# For visualizations
import matplotlib.pyplot as plt
import seaborn as sns
# For splitting the data into train/test
from sklearn.model_selection import train_test_split
# For handling warnings
import warnings
warnings.filterwarnings("ignore")
from statsmodels.tools.sm_exceptions import ConvergenceWarning
warnings.simplefilter("ignore", ConvergenceWarning)
# For building a logistic regression prediction model
import statsmodels.stats.api as sms
from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
from statsmodels.tools.tools import add_constant
from sklearn.linear_model import LogisticRegression
# For building a decision trees prediction model
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
# For tuning decision tree
from sklearn.model_selection import GridSearchCV
# For performing statistical analysis
import scipy.stats as stats
# To get different metric scores (for both logistic regression and decision trees)
from sklearn.metrics import (
f1_score,
accuracy_score,
recall_score,
precision_score,
confusion_matrix,
roc_auc_score,
plot_confusion_matrix,  # removed in scikit-learn 1.2; newer versions provide ConfusionMatrixDisplay instead
precision_recall_curve,
roc_curve,
make_scorer,
)
# Reading the data into a dataframe
df = pd.read_csv("INNHotelsGroup.csv")
df
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled |
| 1 | INN00002 | 2 | 0 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled |
| 2 | INN00003 | 1 | 0 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled |
| 3 | INN00004 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 4 | INN00005 | 2 | 0 | 1 | 1 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 36270 | INN36271 | 3 | 0 | 2 | 6 | Meal Plan 1 | 0 | Room_Type 4 | 85 | 2018 | 8 | 3 | Online | 0 | 0 | 0 | 167.80 | 1 | Not_Canceled |
| 36271 | INN36272 | 2 | 0 | 1 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 228 | 2018 | 10 | 17 | Online | 0 | 0 | 0 | 90.95 | 2 | Canceled |
| 36272 | INN36273 | 2 | 0 | 2 | 6 | Meal Plan 1 | 0 | Room_Type 1 | 148 | 2018 | 7 | 1 | Online | 0 | 0 | 0 | 98.39 | 2 | Not_Canceled |
| 36273 | INN36274 | 2 | 0 | 0 | 3 | Not Selected | 0 | Room_Type 1 | 63 | 2018 | 4 | 21 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
| 36274 | INN36275 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 207 | 2018 | 12 | 30 | Offline | 0 | 0 | 0 | 161.67 | 0 | Not_Canceled |
36275 rows × 19 columns
# Viewing first five rows
df.head()
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | 2 | 0 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled |
| 1 | INN00002 | 2 | 0 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled |
| 2 | INN00003 | 1 | 0 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled |
| 3 | INN00004 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 4 | INN00005 | 2 | 0 | 1 | 1 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled |
# Viewing a sample of the data
df.sample(10)
| | Booking_ID | no_of_adults | no_of_children | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 18226 | INN18227 | 2 | 1 | 2 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 5 | 2018 | 3 | 26 | Online | 0 | 0 | 0 | 145.00 | 1 | Not_Canceled |
| 22851 | INN22852 | 1 | 0 | 0 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 156 | 2018 | 8 | 11 | Online | 0 | 0 | 0 | 108.90 | 0 | Canceled |
| 9369 | INN09370 | 2 | 0 | 2 | 2 | Not Selected | 0 | Room_Type 1 | 38 | 2018 | 8 | 27 | Online | 0 | 0 | 0 | 107.55 | 2 | Not_Canceled |
| 4092 | INN04093 | 2 | 0 | 0 | 3 | Meal Plan 2 | 0 | Room_Type 1 | 34 | 2017 | 9 | 23 | Online | 0 | 0 | 0 | 224.67 | 0 | Canceled |
| 9698 | INN09699 | 2 | 0 | 0 | 2 | Meal Plan 2 | 0 | Room_Type 1 | 265 | 2018 | 6 | 24 | Offline | 0 | 0 | 0 | 115.00 | 1 | Canceled |
| 14154 | INN14155 | 2 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Offline | 0 | 0 | 0 | 100.00 | 0 | Canceled |
| 27736 | INN27737 | 2 | 0 | 1 | 3 | Meal Plan 1 | 0 | Room_Type 2 | 62 | 2018 | 2 | 15 | Online | 0 | 0 | 0 | 65.66 | 1 | Not_Canceled |
| 5081 | INN05082 | 1 | 0 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 164 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 100.00 | 0 | Not_Canceled |
| 3955 | INN03956 | 2 | 0 | 2 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 256 | 2018 | 10 | 16 | Online | 0 | 0 | 0 | 100.75 | 0 | Canceled |
| 5176 | INN05177 | 3 | 0 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 4 | 103 | 2018 | 8 | 7 | Online | 0 | 0 | 0 | 150.30 | 1 | Not_Canceled |
# Shape of the dataframe
print(f"The shape of the dataframe is {df.shape}. \nThere are {df.shape[0]} rows and {df.shape[1]} columns")
The shape of the dataframe is (36275, 19). There are 36275 rows and 19 columns
# Datatypes in dataframe
print(df.dtypes)
Booking_ID object
no_of_adults int64
no_of_children int64
no_of_weekend_nights int64
no_of_week_nights int64
type_of_meal_plan object
required_car_parking_space int64
room_type_reserved object
lead_time int64
arrival_year int64
arrival_month int64
arrival_date int64
market_segment_type object
repeated_guest int64
no_of_previous_cancellations int64
no_of_previous_bookings_not_canceled int64
avg_price_per_room float64
no_of_special_requests int64
booking_status object
dtype: object
# Making a copy of the original dataframe to preserve the original data
df1 = df.copy()
# Converting Booking_ID, type_of_meal_plan, room_type_reserved, market_segment_type, and booking_status to categorical type from object type
cols = ["Booking_ID", "type_of_meal_plan", "room_type_reserved", "market_segment_type", "booking_status"]
df1[cols] = df1[cols].astype("category")
# Confirming that datatype conversion was successful and there are no more object datatypes
print(df1.dtypes)
Booking_ID category
no_of_adults int64
no_of_children int64
no_of_weekend_nights int64
no_of_week_nights int64
type_of_meal_plan category
required_car_parking_space int64
room_type_reserved category
lead_time int64
arrival_year int64
arrival_month int64
arrival_date int64
market_segment_type category
repeated_guest int64
no_of_previous_cancellations int64
no_of_previous_bookings_not_canceled int64
avg_price_per_room float64
no_of_special_requests int64
booking_status category
dtype: object
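As a rough illustration of what the conversion to `category` buys for low-cardinality text columns, the sketch below compares the memory footprint of an object column with its categorical equivalent. The frame `demo` and its repeated values are synthetic and only borrow the column name `market_segment_type`; a column with one distinct value per row, such as `Booking_ID`, would gain little from the same conversion.

```python
import pandas as pd

# Synthetic stand-in for a low-cardinality text column (illustrative values only)
demo = pd.DataFrame({"market_segment_type": ["Online", "Offline", "Corporate"] * 1000})

as_object = demo["market_segment_type"].astype("object")
as_category = demo["market_segment_type"].astype("category")

# deep=True measures the stored Python strings as well, not only the references to them
object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(f"object:   {object_bytes} bytes")
print(f"category: {category_bytes} bytes")
```

The categorical column stores each distinct label once plus a small integer code per row, which is why it is smaller here.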
# Determining if there are any missing values in the dataframe
# Total missing values for each column sorted in descending order
print(df1.isnull().sum().sort_values(ascending=False))
Booking_ID 0
arrival_month 0
no_of_special_requests 0
avg_price_per_room 0
no_of_previous_bookings_not_canceled 0
no_of_previous_cancellations 0
repeated_guest 0
market_segment_type 0
arrival_date 0
arrival_year 0
no_of_adults 0
lead_time 0
room_type_reserved 0
required_car_parking_space 0
type_of_meal_plan 0
no_of_week_nights 0
no_of_weekend_nights 0
no_of_children 0
booking_status 0
dtype: int64
# Any missing values?
print("True or False, are there any missing values?", df1.isnull().values.any())
True or False, are there any missing values? False
# Total number of missing values (if any)
print("Total number of missing values (if any) is", df1.isnull().sum().sum())
Total number of missing values (if any) is 0
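The three missing-value checks above could be folded into one summary table. The following is a minimal sketch, assuming a hypothetical helper named `missing_summary` and a toy frame with deliberately injected gaps (the hotel data itself has none).

```python
import numpy as np
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Missing-value count and percentage per column, largest first (hypothetical helper)."""
    counts = frame.isnull().sum()
    percent = 100 * counts / len(frame)
    summary = pd.DataFrame({"missing": counts, "percent": percent})
    return summary.sort_values("missing", ascending=False)


# Toy frame with injected gaps; not the hotel data
toy = pd.DataFrame({
    "lead_time": [224, 5, np.nan, 211],
    "avg_price_per_room": [65.0, np.nan, np.nan, 100.0],
    "booking_status": ["Not_Canceled", "Not_Canceled", "Canceled", "Canceled"],
})
summary = missing_summary(toy)
print(summary)
```

On the notebook's data the call would be `missing_summary(df1)`, and every row of the result would show zero.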
# Number of unique values in each column sorted in descending order
print(df1.nunique().sort_values(ascending=False))
Booking_ID 36275
avg_price_per_room 3930
lead_time 352
no_of_previous_bookings_not_canceled 59
arrival_date 31
no_of_week_nights 18
arrival_month 12
no_of_previous_cancellations 9
no_of_weekend_nights 8
room_type_reserved 7
no_of_children 6
no_of_special_requests 6
no_of_adults 5
market_segment_type 5
type_of_meal_plan 4
arrival_year 2
repeated_guest 2
required_car_parking_space 2
booking_status 2
dtype: int64
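`Booking_ID` has 36,275 unique values, one per row, so it behaves as an identifier rather than a feature. A minimal sketch of how such columns could be flagged automatically is shown below; `identifier_columns` is a hypothetical helper, and the frame holds three columns from the first five rows shown earlier. Whether a flagged column is dropped before modeling remains a judgment call.

```python
import pandas as pd


def identifier_columns(frame: pd.DataFrame) -> list:
    """Names of columns in which every row holds a distinct value (hypothetical helper)."""
    n_rows = len(frame)
    return [col for col in frame.columns if frame[col].nunique() == n_rows]


# Three columns of the first five bookings shown earlier
head = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004", "INN00005"],
    "no_of_adults": [2, 2, 1, 2, 2],
    "booking_status": ["Not_Canceled", "Not_Canceled", "Canceled", "Canceled", "Canceled"],
})
id_cols = identifier_columns(head)
print(id_cols)
```

On a small sample a continuous column can also be flagged by chance, so the check is best run on the full dataframe.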
# Distribution of numerical variables
sns.set_style("darkgrid")
df1.hist(figsize = (20, 15))
plt.show()
# Statistical summary of all data
df1.describe(include = 'all').T
| | count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Booking_ID | 36275 | 36275 | INN00001 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| no_of_adults | 36275.0 | NaN | NaN | NaN | 1.844962 | 0.518715 | 0.0 | 2.0 | 2.0 | 2.0 | 4.0 |
| no_of_children | 36275.0 | NaN | NaN | NaN | 0.105279 | 0.402648 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
| no_of_weekend_nights | 36275.0 | NaN | NaN | NaN | 0.810724 | 0.870644 | 0.0 | 0.0 | 1.0 | 2.0 | 7.0 |
| no_of_week_nights | 36275.0 | NaN | NaN | NaN | 2.2043 | 1.410905 | 0.0 | 1.0 | 2.0 | 3.0 | 17.0 |
| type_of_meal_plan | 36275 | 4 | Meal Plan 1 | 27835 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| required_car_parking_space | 36275.0 | NaN | NaN | NaN | 0.030986 | 0.173281 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| room_type_reserved | 36275 | 7 | Room_Type 1 | 28130 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| lead_time | 36275.0 | NaN | NaN | NaN | 85.232557 | 85.930817 | 0.0 | 17.0 | 57.0 | 126.0 | 443.0 |
| arrival_year | 36275.0 | NaN | NaN | NaN | 2017.820427 | 0.383836 | 2017.0 | 2018.0 | 2018.0 | 2018.0 | 2018.0 |
| arrival_month | 36275.0 | NaN | NaN | NaN | 7.423653 | 3.069894 | 1.0 | 5.0 | 8.0 | 10.0 | 12.0 |
| arrival_date | 36275.0 | NaN | NaN | NaN | 15.596995 | 8.740447 | 1.0 | 8.0 | 16.0 | 23.0 | 31.0 |
| market_segment_type | 36275 | 5 | Online | 23214 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| repeated_guest | 36275.0 | NaN | NaN | NaN | 0.025637 | 0.158053 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| no_of_previous_cancellations | 36275.0 | NaN | NaN | NaN | 0.023349 | 0.368331 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 |
| no_of_previous_bookings_not_canceled | 36275.0 | NaN | NaN | NaN | 0.153411 | 1.754171 | 0.0 | 0.0 | 0.0 | 0.0 | 58.0 |
| avg_price_per_room | 36275.0 | NaN | NaN | NaN | 103.423539 | 35.089424 | 0.0 | 80.3 | 99.45 | 120.0 | 540.0 |
| no_of_special_requests | 36275.0 | NaN | NaN | NaN | 0.619655 | 0.786236 | 0.0 | 0.0 | 0.0 | 1.0 | 5.0 |
| booking_status | 36275 | 2 | Not_Canceled | 24390 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
The target variable (booking_status) doesn't have any missing values and is of type category.
The most frequent value of the target variable (booking_status) is Not_Canceled, with 24,390 values, suggesting the data is imbalanced.
# function to plot a boxplot and a histogram along the same scale.
def histogram_boxplot(data, feature, figsize=(12, 7), kde=False, bins=None):
    """
    Boxplot and histogram combined
    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to show the density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid = 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a marker will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # For histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
# function to create labeled barplots
def labeled_barplot(data, feature, perc=False, n=None):
    """
    Barplot with percentage at the top
    data: dataframe
    feature: dataframe column
    perc: whether to display percentages instead of count (default is False)
    n: displays the top n category levels (default is None, i.e., display all levels)
    """
    total = len(data[feature])  # length of the column
    count = data[feature].nunique()
    if n is None:
        plt.figure(figsize=(count + 1, 5))
    else:
        plt.figure(figsize=(n + 1, 5))
    plt.xticks(rotation=90, fontsize=15)
    ax = sns.countplot(
        data=data,
        x=feature,
        palette="Paired",
        order=data[feature].value_counts().index[:n].sort_values(),
    )
    for p in ax.patches:
        if perc:
            label = "{:.1f}%".format(
                100 * p.get_height() / total
            )  # percentage of each class of the category
        else:
            label = p.get_height()  # count of each level of the category
        x = p.get_x() + p.get_width() / 2  # horizontal center of the bar
        y = p.get_height()  # top of the bar
        ax.annotate(
            label,
            (x, y),
            ha="center",
            va="center",
            size=12,
            xytext=(0, 5),
            textcoords="offset points",
        )  # annotate the count or percentage
    plt.show()  # show the plot
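When `perc=True`, `labeled_barplot` labels each bar with `100 * height / total`, where `total` is the number of rows. The sketch below reproduces that arithmetic without plotting, using the `no_of_adults` counts that the stacked-barplot crosstab prints further down in this notebook; the dictionary name `adult_counts` is an illustrative choice.

```python
# Counts per no_of_adults level, copied from the crosstab printed later in the notebook
adult_counts = {0: 139, 1: 7695, 2: 26108, 3: 2317, 4: 16}
total = sum(adult_counts.values())  # number of bookings

# Same formatting that labeled_barplot applies when perc=True
labels = {
    level: "{:.1f}%".format(100 * count / total)
    for level, count in adult_counts.items()
}
for level, label in labels.items():
    print(level, label)
```

Because the denominator is the full row count, the labels sum to roughly 100% only when all levels are displayed, that is, when `n` is left at `None`.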
# Number of adults for a booking
labeled_barplot(df1, "no_of_adults")
# Number of adults for a booking as a percentage
labeled_barplot(df1, "no_of_adults", perc = True)
# Number of adults for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_adults")
# Number of children for a booking
labeled_barplot(df1, "no_of_children")
# Number of children for a booking as a percentage
labeled_barplot(df1, "no_of_children", perc = True)
# Number of children for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_children")
# Number of weekend nights for a booking
labeled_barplot(df1, "no_of_weekend_nights")
# Number of weekend nights for a booking as a percentage
labeled_barplot(df1, "no_of_weekend_nights", perc = True)
# Number of weekend nights for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_weekend_nights")
# Number of week nights for a booking
labeled_barplot(df1, "no_of_week_nights")
# Number of week nights for a booking as a percentage
labeled_barplot(df1, "no_of_week_nights", perc = True)
# Number of week nights for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_week_nights")
# Type of meal plan for a booking
labeled_barplot(df1, "type_of_meal_plan")
# Type of meal plan for a booking as a percentage
labeled_barplot(df1, "type_of_meal_plan", perc = True)
# Required number of parking spaces for a booking
labeled_barplot(df1, "required_car_parking_space")
# Required number of parking spaces for a booking as a percentage
labeled_barplot(df1, "required_car_parking_space", perc = True)
# Number of parking spaces for a booking histogram and boxplot
histogram_boxplot(df1, "required_car_parking_space")
# Room type reserved for a booking
labeled_barplot(df1, "room_type_reserved")
# Room type reserved for a booking as a percentage
labeled_barplot(df1, "room_type_reserved", perc = True)
# Lead time (number of days between booking and arrival date) for a booking
labeled_barplot(df1, "lead_time")
# lead_time has too many distinct values for a readable barplot, so display the value counts for each lead time
df1.lead_time.value_counts().sort_values(ascending=False)
0 1297
1 1078
2 643
3 630
4 628
...
348 1
345 1
350 1
325 1
351 1
Name: lead_time, Length: 352, dtype: int64
# lead_time has too many distinct values for a readable barplot, so display the value counts for each lead time as a percentage
df1.lead_time.value_counts(normalize = True).sort_values(ascending=False) * 100
0 3.575465
1 2.971744
2 1.772571
3 1.736733
4 1.731220
...
348 0.002757
345 0.002757
350 0.002757
325 0.002757
351 0.002757
Name: lead_time, Length: 352, dtype: float64
# Lead time histogram and boxplot
histogram_boxplot(df1, "lead_time")
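The boxplot half of `histogram_boxplot` relies on the default 1.5 × IQR rule to decide which points are drawn as outliers. As a worked example, the sketch below applies that rule to the `lead_time` quartiles reported in the statistical summary (25% = 17, 75% = 126); the variable names are illustrative.

```python
# Quartiles of lead_time, read from the statistical summary table
q1, q3 = 17.0, 126.0
iqr = q3 - q1

# Default boxplot rule: points beyond 1.5 * IQR from the quartiles are drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# lead_time cannot be negative, so only the upper fence matters in practice
print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
```

The whiskers themselves stop at the most extreme observations inside these fences, so lead times above the upper fence, up to the maximum of 443 days, appear as individual points.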
# Arrival Year for a booking
labeled_barplot(df1, "arrival_year")
# Arrival Year for a booking as a percentage
labeled_barplot(df1, "arrival_year", perc = True)
# Arrival Year for a booking histogram and boxplot
histogram_boxplot(df1, "arrival_year")
# Arrival month for a booking
labeled_barplot(df1, "arrival_month")
# Arrival month for a booking as a percentage
labeled_barplot(df1, "arrival_month", perc = True)
# Arrival month for a booking histogram and boxplot
histogram_boxplot(df1, "arrival_month")
# Arrival date for a booking
labeled_barplot(df1, "arrival_date")
# arrival_date has too many distinct values for a readable barplot, so display the value counts for each arrival date
df1.arrival_date.value_counts().sort_values(ascending=False)
13 1358
17 1345
2 1331
4 1327
19 1327
16 1306
20 1281
15 1273
6 1273
18 1260
14 1242
30 1216
12 1204
8 1198
29 1190
21 1158
5 1154
26 1146
25 1146
1 1133
9 1130
28 1129
7 1110
24 1103
11 1098
3 1098
10 1089
27 1059
22 1023
23 990
31 578
Name: arrival_date, dtype: int64
# arrival_date has too many distinct values for a readable barplot, so display the value counts for each arrival date as a percentage
df1.arrival_date.value_counts(normalize = True).sort_values(ascending=False) * 100
13 3.743625
17 3.707788
2 3.669194
4 3.658167
19 3.658167
16 3.600276
20 3.531358
15 3.509304
6 3.509304
18 3.473467
14 3.423846
30 3.352171
12 3.319090
8 3.302550
29 3.280496
21 3.192281
5 3.181254
26 3.159201
25 3.159201
1 3.123363
9 3.115093
28 3.112336
7 3.059959
24 3.040662
11 3.026878
3 3.026878
10 3.002068
27 2.919366
22 2.820124
23 2.729152
31 1.593384
Name: arrival_date, dtype: float64
# Arrival date histogram and boxplot
histogram_boxplot(df1, "arrival_date")
# Market segment type for a booking
labeled_barplot(df1, "market_segment_type")
# Market segment type as a percentage for a booking
labeled_barplot(df1, "market_segment_type", perc = True)
# Repeat guest for a booking
labeled_barplot(df1, "repeated_guest")
# Repeat guest for a booking as a percentage
labeled_barplot(df1, "repeated_guest", perc = True)
# Repeat guest for a booking histogram and boxplot
histogram_boxplot(df1, "repeated_guest")
# Number of previous booking cancellations for a booking
labeled_barplot(df1, "no_of_previous_cancellations")
# Number of previous booking cancellations for a booking as a percentage
labeled_barplot(df1, "no_of_previous_cancellations", perc = True)
# Number of previous booking cancellations for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_previous_cancellations")
# Number of previous bookings not canceled for a booking
labeled_barplot(df1, "no_of_previous_bookings_not_canceled")
# no_of_previous_bookings_not_canceled has too many distinct values for a readable barplot, so display the value counts instead
df1.no_of_previous_bookings_not_canceled.value_counts().sort_values(ascending=False)
0 35463
1 228
2 112
3 80
4 65
5 60
6 36
7 24
8 23
10 19
9 19
11 15
12 12
14 9
15 8
13 7
16 7
18 6
20 6
21 6
17 6
19 6
22 6
25 3
27 3
24 3
23 3
26 2
31 2
30 2
32 2
48 2
28 2
44 2
29 2
56 1
47 1
49 1
52 1
39 1
34 1
38 1
51 1
42 1
37 1
33 1
35 1
50 1
43 1
40 1
41 1
58 1
54 1
53 1
57 1
45 1
55 1
46 1
36 1
Name: no_of_previous_bookings_not_canceled, dtype: int64
# no_of_previous_bookings_not_canceled has too many distinct values for a readable barplot, so display the value counts as a percentage
df1.no_of_previous_bookings_not_canceled.value_counts(normalize = True).sort_values(ascending=False) * 100
0 97.761544
1 0.628532
2 0.308753
3 0.220538
4 0.179187
5 0.165403
6 0.099242
7 0.066161
8 0.063405
10 0.052378
9 0.052378
11 0.041351
12 0.033081
14 0.024810
15 0.022054
13 0.019297
16 0.019297
18 0.016540
20 0.016540
21 0.016540
17 0.016540
19 0.016540
22 0.016540
25 0.008270
27 0.008270
24 0.008270
23 0.008270
26 0.005513
31 0.005513
30 0.005513
32 0.005513
48 0.005513
28 0.005513
44 0.005513
29 0.005513
56 0.002757
47 0.002757
49 0.002757
52 0.002757
39 0.002757
34 0.002757
38 0.002757
51 0.002757
42 0.002757
37 0.002757
33 0.002757
35 0.002757
50 0.002757
43 0.002757
40 0.002757
41 0.002757
58 0.002757
54 0.002757
53 0.002757
57 0.002757
45 0.002757
55 0.002757
46 0.002757
36 0.002757
Name: no_of_previous_bookings_not_canceled, dtype: float64
# Number of previous bookings not canceled for a booking histogram and boxplot
histogram_boxplot(df1, "no_of_previous_bookings_not_canceled")
# Average price of a room booking (in Euros)
# avg_price_per_room has too many distinct values for a barplot, so display the value counts for each average room price
df1.avg_price_per_room.value_counts().sort_values(ascending=False)
65.00 848
75.00 826
90.00 703
95.00 669
115.00 662
...
103.86 1
129.83 1
246.60 1
86.85 1
167.80 1
Name: avg_price_per_room, Length: 3930, dtype: int64
# avg_price_per_room has too many distinct values for a barplot, so display the value counts for each average room price as a percentage
df1.avg_price_per_room.value_counts(normalize = True).sort_values(ascending=False) * 100
65.00 2.337698
75.00 2.277050
90.00 1.937974
95.00 1.844245
115.00 1.824948
...
103.86 0.002757
129.83 0.002757
246.60 0.002757
86.85 0.002757
167.80 0.002757
Name: avg_price_per_room, Length: 3930, dtype: float64
# Average price of a room booking boxplot and histogram
histogram_boxplot(df1, "avg_price_per_room")
# Number of special requests per booking
labeled_barplot(df1, "no_of_special_requests")
# Number of special requests per booking as a percentage
labeled_barplot(df1, "no_of_special_requests", perc = True)
# Number of special requests per booking histogram and boxplot
histogram_boxplot(df1, "no_of_special_requests")
# Booking status across all bookings
labeled_barplot(df1, "booking_status")
# Booking status across all bookings as a percentage
labeled_barplot(df1, "booking_status", perc = True)
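The class balance that the booking-status barplot displays can also be worked out by hand from counts reported elsewhere in this notebook: 24,390 `Not_Canceled` bookings in the statistical summary and 11,885 `Canceled` bookings in the crosstab totals. The sketch below does that arithmetic; the majority-class baseline at the end is an added reference point, not something the notebook computes.

```python
# Class counts reported in the statistical summary and the crosstab totals
not_canceled = 24390
canceled = 11885
total = not_canceled + canceled

share_not_canceled = 100 * not_canceled / total
share_canceled = 100 * canceled / total

print(f"Not_Canceled: {share_not_canceled:.2f}%")
print(f"Canceled:     {share_canceled:.2f}%")

# Accuracy of a trivial model that always predicts the majority class;
# any fitted classifier should beat this to be useful
baseline_accuracy = not_canceled / total
print(f"Majority-class baseline accuracy: {baseline_accuracy:.4f}")
```

With roughly two thirds of bookings in one class, accuracy alone can look flattering, which is one reason the imports also bring in recall, precision, and F1.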
# Correlation heatmap for all numerical variables
plt.figure(figsize=(15, 7))
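# Restrict to numeric columns; corr() can raise an error on categorical columns in newer pandas versions
sns.heatmap(df1.select_dtypes(include="number").corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")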
plt.show()
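A heatmap with this many columns can be hard to scan, so it may help to list the strongest correlations as numbers. The sketch below is one way to do that, assuming a hypothetical helper `strongest_pairs` and synthetic data in which column `b` is built to track column `a`; it does not reproduce any result from the hotel dataset.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def strongest_pairs(frame, top=3):
    """Rank column pairs by absolute Pearson correlation (hypothetical helper)."""
    corr = frame.select_dtypes(include="number").corr()
    # Visit each unordered pair once, which skips the diagonal and mirrored entries
    pairs = {(x, y): corr.loc[x, y] for x, y in combinations(corr.columns, 2)}
    ranked = sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top]


# Synthetic data: b follows a closely, c is unrelated noise
rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=200),
    "c": rng.normal(size=200),
})
top_pairs = strongest_pairs(toy, top=1)
print(top_pairs)
```

On the notebook's data the call would be `strongest_pairs(df1)`, which returns a list of `((column_x, column_y), correlation)` tuples.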
# Pairplots with hue as booking status (the target variable)
sns.pairplot(df1, hue = "booking_status")
plt.show()
def stacked_barplot(data, predictor, target):
    """
    Print the category counts and plot a stacked bar chart
    data: dataframe
    predictor: independent variable
    target: target variable
    """
    count = data[predictor].nunique()
    # Least frequent target class; both tables are sorted by this column
    sorter = data[target].value_counts().index[-1]
    tab1 = pd.crosstab(data[predictor], data[target], margins=True).sort_values(
        by=sorter, ascending=False
    )
    print(tab1)
    print("-" * 120)
    tab = pd.crosstab(data[predictor], data[target], normalize="index").sort_values(
        by=sorter, ascending=False
    )
    tab.plot(kind="bar", stacked=True, figsize=(count + 5, 5))
    # Place the legend outside the plot area
    plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
    plt.show()
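`stacked_barplot` rests on two `pd.crosstab` calls: one with `margins=True` for the printed counts and one with `normalize="index"` for the plotted shares. The sketch below shows both on the first five bookings displayed earlier, using only their `market_segment_type` and `booking_status` values; the frame name `head` is an illustrative choice.

```python
import pandas as pd

# market_segment_type and booking_status of the first five bookings shown earlier
head = pd.DataFrame({
    "market_segment_type": ["Offline", "Online", "Online", "Online", "Online"],
    "booking_status": ["Not_Canceled", "Not_Canceled", "Canceled", "Canceled", "Canceled"],
})

# Raw counts with row and column totals (the table that gets printed)
counts = pd.crosstab(head["market_segment_type"], head["booking_status"], margins=True)

# Row-normalized shares (the table that gets plotted): each row sums to 1
shares = pd.crosstab(head["market_segment_type"], head["booking_status"], normalize="index")

print(counts)
print(shares)
```

Because the shares are normalized per row, every stacked bar has the same height, and the plot compares cancellation proportions rather than booking volumes.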
### function to plot distributions wrt target
def distribution_plot_wrt_target(data, predictor, target):
    fig, axs = plt.subplots(2, 2, figsize=(12, 10))
    target_uniq = data[target].unique()
    axs[0, 0].set_title("Distribution of target for target=" + str(target_uniq[0]))
    sns.histplot(
        data=data[data[target] == target_uniq[0]],
        x=predictor,
        kde=True,
        ax=axs[0, 0],
        color="teal",
        stat="density",
    )
    axs[0, 1].set_title("Distribution of target for target=" + str(target_uniq[1]))
    sns.histplot(
        data=data[data[target] == target_uniq[1]],
        x=predictor,
        kde=True,
        ax=axs[0, 1],
        color="orange",
        stat="density",
    )
    axs[1, 0].set_title("Boxplot w.r.t target")
    sns.boxplot(data=data, x=target, y=predictor, ax=axs[1, 0], palette="gist_rainbow")
    axs[1, 1].set_title("Boxplot (without outliers) w.r.t target")
    sns.boxplot(
        data=data,
        x=target,
        y=predictor,
        ax=axs[1, 1],
        showfliers=False,
        palette="gist_rainbow",
    )
    plt.tight_layout()
    plt.show()
# Booking_status vs no_of_adults stacked barplot
stacked_barplot(df1, "no_of_adults", "booking_status")
booking_status Canceled Not_Canceled All
no_of_adults
All 11885 24390 36275
2 9119 16989 26108
1 1856 5839 7695
3 863 1454 2317
0 44 95 139
4 3 13 16
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs no_of_adults distribution plot
distribution_plot_wrt_target(df1, "no_of_adults", "booking_status")
# Booking_status vs no_of_children stacked barplot
stacked_barplot(df1, "no_of_children", "booking_status")
booking_status Canceled Not_Canceled All
no_of_children
All 11885 24390 36275
0 10882 22695 33577
1 540 1078 1618
2 457 601 1058
3 5 14 19
9 1 1 2
10 0 1 1
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs no_of_children distribution plot
distribution_plot_wrt_target(df1, "no_of_children", "booking_status")
# Booking_status vs no_of_weekend_nights stacked barplot
stacked_barplot(df1, "no_of_weekend_nights", "booking_status")
booking_status Canceled Not_Canceled All
no_of_weekend_nights
All 11885 24390 36275
0 5093 11779 16872
1 3432 6563 9995
2 3157 5914 9071
4 83 46 129
3 74 79 153
5 29 5 34
6 16 4 20
7 1 0 1
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs no_of_weekend_nights distribution plot
distribution_plot_wrt_target(df1, "no_of_weekend_nights", "booking_status")
# Booking_status vs no_of_week_nights stacked barplot
stacked_barplot(df1, "no_of_week_nights", "booking_status")
booking_status Canceled Not_Canceled All
no_of_week_nights
All 11885 24390 36275
2 3997 7447 11444
3 2574 5265 7839
1 2572 6916 9488
4 1143 1847 2990
0 679 1708 2387
5 632 982 1614
6 88 101 189
10 53 9 62
7 52 61 113
8 32 30 62
9 21 13 34
11 14 3 17
15 8 2 10
12 7 2 9
13 5 0 5
14 4 3 7
16 2 0 2
17 2 1 3
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs no_of_week_nights distribution plot
distribution_plot_wrt_target(df1, "no_of_week_nights", "booking_status")
# Booking_status vs type_of_meal_plan stacked barplot
stacked_barplot(df1, "type_of_meal_plan", "booking_status")
booking_status Canceled Not_Canceled All
type_of_meal_plan
All 11885 24390 36275
Meal Plan 1 8679 19156 27835
Not Selected 1699 3431 5130
Meal Plan 2 1506 1799 3305
Meal Plan 3 1 4 5
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs type_of_meal_plan distribution plot
sns.histplot(df1, x = "type_of_meal_plan", hue = "booking_status", kde = True)
# Booking_status vs required_car_parking_space stacked barplot
stacked_barplot(df1, "required_car_parking_space", "booking_status")
booking_status Canceled Not_Canceled All
required_car_parking_space
All 11885 24390 36275
0 11771 23380 35151
1 114 1010 1124
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs required_car_parking_space distribution plot
distribution_plot_wrt_target(df1, "required_car_parking_space", "booking_status")
# Booking_status vs room_type_reserved stacked barplot
stacked_barplot(df1, "room_type_reserved", "booking_status")
booking_status Canceled Not_Canceled All
room_type_reserved
All 11885 24390 36275
Room_Type 1 9072 19058 28130
Room_Type 4 2069 3988 6057
Room_Type 6 406 560 966
Room_Type 2 228 464 692
Room_Type 5 72 193 265
Room_Type 7 36 122 158
Room_Type 3 2 5 7
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs room_type_reserved distribution plot
sns.histplot(df1, x = "room_type_reserved", hue = "booking_status", kde = True)
plt.xticks(rotation = 90)
# Booking_status vs lead_time stacked displot
sns.displot(df1, x = "lead_time", hue = "booking_status", bins = 50)
# Booking_status vs lead_time distribution plot
distribution_plot_wrt_target(df1, "lead_time", "booking_status")
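A per-class summary statistic can serve as a numeric companion to these plots. The sketch below computes the median `lead_time` per booking status on the ten sampled bookings shown earlier; ten rows are far too few to draw conclusions from, so this only illustrates the mechanics of the `groupby` call.

```python
import pandas as pd

# lead_time and booking_status of the ten sampled bookings shown earlier
sample = pd.DataFrame({
    "lead_time": [5, 156, 38, 34, 265, 211, 62, 164, 256, 103],
    "booking_status": [
        "Not_Canceled", "Canceled", "Not_Canceled", "Canceled", "Canceled",
        "Canceled", "Not_Canceled", "Not_Canceled", "Canceled", "Not_Canceled",
    ],
})

# Median lead time within each booking status
medians = sample.groupby("booking_status")["lead_time"].median()
print(medians)
```

The same call on `df1` would give the medians that the boxplots in `distribution_plot_wrt_target` draw as the center lines.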
# Booking_status vs arrival_year stacked barplot
stacked_barplot(df1, "arrival_year", "booking_status")
booking_status Canceled Not_Canceled All
arrival_year
All 11885 24390 36275
2018 10924 18837 29761
2017 961 5553 6514
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs arrival_year distribution plot
distribution_plot_wrt_target(df1, "arrival_year", "booking_status")
# Booking_status vs arrival_month stacked barplot
stacked_barplot(df1, "arrival_month", "booking_status")
booking_status Canceled Not_Canceled All
arrival_month
All 11885 24390 36275
10 1880 3437 5317
9 1538 3073 4611
8 1488 2325 3813
7 1314 1606 2920
6 1291 1912 3203
4 995 1741 2736
5 948 1650 2598
11 875 2105 2980
3 700 1658 2358
2 430 1274 1704
12 402 2619 3021
1 24 990 1014
------------------------------------------------------------------------------------------------------------------------
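The printed crosstab is sorted by the number of cancellations, which favors busy months. Dividing by each month's total gives the cancellation rate instead. The sketch below does this with the counts copied from the output above; the dictionary `month_counts` is simply a hand-typed copy of that table.

```python
# (canceled, total) per arrival_month, copied from the crosstab output above
month_counts = {
    1: (24, 1014), 2: (430, 1704), 3: (700, 2358), 4: (995, 2736),
    5: (948, 2598), 6: (1291, 3203), 7: (1314, 2920), 8: (1488, 3813),
    9: (1538, 4611), 10: (1880, 5317), 11: (875, 2980), 12: (402, 3021),
}

# Cancellation rate in percent for each month
rates = {month: 100 * c / t for month, (c, t) in month_counts.items()}

# Rank months from highest to lowest cancellation rate
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)
for month, rate in ranked:
    print(f"month {month:>2}: {rate:.1f}% canceled")

highest_month, lowest_month = ranked[0][0], ranked[-1][0]
```

October has the most cancellations in absolute terms, but by rate July ranks first and January last, so the two orderings answer different questions.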
# Booking_status vs arrival_month distribution plot
distribution_plot_wrt_target(df1, "arrival_month", "booking_status")
# Booking_status vs arrival_date stacked barplot
stacked_barplot(df1, "arrival_date", "booking_status")
booking_status Canceled Not_Canceled All
arrival_date
All 11885 24390 36275
15 538 735 1273
4 474 853 1327
16 473 833 1306
30 465 751 1216
1 465 668 1133
12 460 744 1204
17 448 897 1345
6 444 829 1273
26 425 721 1146
19 413 914 1327
20 413 868 1281
13 408 950 1358
28 405 724 1129
3 403 695 1098
25 395 751 1146
21 376 782 1158
24 372 731 1103
18 366 894 1260
7 364 746 1110
8 356 842 1198
22 351 672 1023
23 341 649 990
29 334 856 1190
11 330 768 1098
5 328 826 1154
14 327 915 1242
10 318 771 1089
27 313 746 1059
2 308 1023 1331
9 294 836 1130
31 178 400 578
------------------------------------------------------------------------------------------------------------------------
# Booking_status vs arrival_date distribution plot
distribution_plot_wrt_target(df1, "arrival_date", "booking_status")
# Booking_status vs market_segment_type stacked barplot
stacked_barplot(df1, "market_segment_type", "booking_status")
| market_segment_type | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| Online | 8475 | 14739 | 23214 |
| Offline | 3153 | 7375 | 10528 |
| Corporate | 220 | 1797 | 2017 |
| Aviation | 37 | 88 | 125 |
| Complementary | 0 | 391 | 391 |
# Booking_status vs market_segment_type distribution plot
sns.histplot(df1, x = "market_segment_type", hue = "booking_status", kde = True)
# Booking_status vs repeated_guest stacked barplot
stacked_barplot(df1, "repeated_guest", "booking_status")
| repeated_guest | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 11869 | 23476 | 35345 |
| 1 | 16 | 914 | 930 |
# Booking_status vs repeated_guest distribution plot
distribution_plot_wrt_target(df1, "repeated_guest", "booking_status")
# Booking_status vs no_of_previous_cancellations stacked barplot
stacked_barplot(df1, "no_of_previous_cancellations", "booking_status")
| no_of_previous_cancellations | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 11869 | 24068 | 35937 |
| 1 | 11 | 187 | 198 |
| 13 | 4 | 0 | 4 |
| 3 | 1 | 42 | 43 |
| 2 | 0 | 46 | 46 |
| 4 | 0 | 10 | 10 |
| 5 | 0 | 11 | 11 |
| 6 | 0 | 1 | 1 |
| 11 | 0 | 25 | 25 |
# Booking_status vs no_of_previous_cancellations distribution plot
sns.histplot(df1, x = "no_of_previous_cancellations", hue = "booking_status", kde = True)
# Booking_status vs no_of_previous_bookings_not_canceled stacked barplot
stacked_barplot(df1, "no_of_previous_bookings_not_canceled", "booking_status")
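| no_of_previous_bookings_not_canceled | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 11878 | 23585 | 35463 |
| 1 | 4 | 224 | 228 |
| 12 | 1 | 11 | 12 |
| 4 | 1 | 64 | 65 |
| 6 | 1 | 35 | 36 |
| 2 | 0 | 112 | 112 |
| 44 | 0 | 2 | 2 |
| 43 | 0 | 1 | 1 |
| 42 | 0 | 1 | 1 |
| 41 | 0 | 1 | 1 |
| 40 | 0 | 1 | 1 |
| 38 | 0 | 1 | 1 |
| 39 | 0 | 1 | 1 |
| 46 | 0 | 1 | 1 |
| 37 | 0 | 1 | 1 |
| 36 | 0 | 1 | 1 |
| 35 | 0 | 1 | 1 |
| 45 | 0 | 1 | 1 |
| 48 | 0 | 2 | 2 |
| 47 | 0 | 1 | 1 |
| 33 | 0 | 1 | 1 |
| 49 | 0 | 1 | 1 |
| 50 | 0 | 1 | 1 |
| 51 | 0 | 1 | 1 |
| 52 | 0 | 1 | 1 |
| 53 | 0 | 1 | 1 |
| 54 | 0 | 1 | 1 |
| 55 | 0 | 1 | 1 |
| 56 | 0 | 1 | 1 |
| 57 | 0 | 1 | 1 |
| 58 | 0 | 1 | 1 |
| 34 | 0 | 1 | 1 |
| 31 | 0 | 2 | 2 |
| 32 | 0 | 2 | 2 |
| 3 | 0 | 80 | 80 |
| 5 | 0 | 60 | 60 |
| 7 | 0 | 24 | 24 |
| 8 | 0 | 23 | 23 |
| 9 | 0 | 19 | 19 |
| 10 | 0 | 19 | 19 |
| 11 | 0 | 15 | 15 |
| 13 | 0 | 7 | 7 |
| 14 | 0 | 9 | 9 |
| 15 | 0 | 8 | 8 |
| 16 | 0 | 7 | 7 |
| 17 | 0 | 6 | 6 |
| 18 | 0 | 6 | 6 |
| 19 | 0 | 6 | 6 |
| 20 | 0 | 6 | 6 |
| 21 | 0 | 6 | 6 |
| 22 | 0 | 6 | 6 |
| 23 | 0 | 3 | 3 |
| 24 | 0 | 3 | 3 |
| 25 | 0 | 3 | 3 |
| 26 | 0 | 2 | 2 |
| 27 | 0 | 3 | 3 |
| 28 | 0 | 2 | 2 |
| 29 | 0 | 2 | 2 |
| 30 | 0 | 2 | 2 |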
# Booking_status vs no_of_previous_bookings_not_canceled distribution plot
distribution_plot_wrt_target(df1, "no_of_previous_bookings_not_canceled", "booking_status")
# Booking_status vs avg_price_per_room stacked displot
sns.displot(df1, x = "avg_price_per_room", hue = "booking_status", bins = 40)
# Booking_status vs avg_price_per_room distribution plot
distribution_plot_wrt_target(df1, "avg_price_per_room", "booking_status")
# Booking_status vs no_of_special_requests stacked barplot
stacked_barplot(df1, "no_of_special_requests", "booking_status")
| no_of_special_requests | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 8545 | 11232 | 19777 |
| 1 | 2703 | 8670 | 11373 |
| 2 | 637 | 3727 | 4364 |
| 3 | 0 | 675 | 675 |
| 4 | 0 | 78 | 78 |
| 5 | 0 | 8 | 8 |
# Booking_status vs no_of_special_requests distribution plot
distribution_plot_wrt_target(df1, "no_of_special_requests", "booking_status")
booking_status is imbalanced.
Leading Questions:
# Arrival month for a booking as a percentage
labeled_barplot(df1, "arrival_month", perc = True)
print("Per the barplot, the top 3 busiest months by bookings are October (10), September (9), and August (8).")
Per the barplot, the top 3 busiest months by bookings are October (10), September (9), and August (8).
# Market segment type as a percentage for a booking
labeled_barplot(df1, "market_segment_type", perc = True)
print("Per the barplot, the top market segment most of the guests come from is Online.")
Per the barplot, the top market segment most of the guests come from is Online.
# avg_price_per_room vs market_segment_type
sns.boxplot(data = df1, y = "market_segment_type", x = "avg_price_per_room")
plt.show()
# Booking status per bookings as a percentage
labeled_barplot(df1, "booking_status", perc = True)
print("Per the barplot, 32.8% of bookings are canceled.")
Per the barplot, 32.8% of bookings are canceled.
# Booking_status vs repeated_guest stacked barplot
stacked_barplot(df1, "repeated_guest", "booking_status")
# 16 of the 930 repeat guests canceled; multiply by 100 so the printed value is a true percentage
percent_repeat_cancel = 16 / 930 * 100
percent_repeat_cancel = round(percent_repeat_cancel, 1)
print("The percentage of repeat guests that canceled is " + str(percent_repeat_cancel) + "%")
| repeated_guest | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 11869 | 23476 | 35345 |
| 1 | 16 | 914 | 930 |
The percentage of repeat guests that canceled is 1.7%
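The repeat-guest percentage above is typed in by hand from the crosstab counts. A minimal sketch of deriving the same rate directly from the data is shown below. `cancel_rate_by` is a hypothetical helper that is not part of this notebook, and the small frame is rebuilt from the printed counts only so that the example runs on its own.

```python
import numpy as np
import pandas as pd


def cancel_rate_by(df, col, target="booking_status", positive="Canceled"):
    """Return the percentage of rows with target == positive for each value of col."""
    is_canceled = df[target].eq(positive)
    return is_canceled.groupby(df[col]).mean().mul(100).round(1)


# Stand-in for df1, rebuilt from the repeated_guest counts printed above
# (assumption: used only to keep the sketch self-contained).
group_sizes = [11869, 23476, 16, 914]
demo = pd.DataFrame(
    {
        "repeated_guest": np.repeat([0, 0, 1, 1], group_sizes),
        "booking_status": np.repeat(
            ["Canceled", "Not_Canceled", "Canceled", "Not_Canceled"], group_sizes
        ),
    }
)

rates = cancel_rate_by(demo, "repeated_guest")
print(rates)
```

In the notebook itself the call would presumably be `cancel_rate_by(df1, "repeated_guest")`, which avoids copying counts by hand.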
# Booking_status vs no_of_special_requests stacked barplot
stacked_barplot(df1, "no_of_special_requests", "booking_status")
print("As seen on the barplot, the chance of a guest canceling falls as the number of special requests increases; no booking with 3 or more special requests was canceled.")
| no_of_special_requests | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 0 | 8545 | 11232 | 19777 |
| 1 | 2703 | 8670 | 11373 |
| 2 | 637 | 3727 | 4364 |
| 3 | 0 | 675 | 675 |
| 4 | 0 | 78 | 78 |
| 5 | 0 | 8 | 8 |
As seen on the barplot, the chance of a guest canceling falls as the number of special requests increases; no booking with 3 or more special requests was canceled.
# Checking for any missing values
df1.isnull().sum()
Booking_ID 0
no_of_adults 0
no_of_children 0
no_of_weekend_nights 0
no_of_week_nights 0
type_of_meal_plan 0
required_car_parking_space 0
room_type_reserved 0
lead_time 0
arrival_year 0
arrival_month 0
arrival_date 0
market_segment_type 0
repeated_guest 0
no_of_previous_cancellations 0
no_of_previous_bookings_not_canceled 0
avg_price_per_room 0
no_of_special_requests 0
booking_status 0
dtype: int64
# Right now the dataframe separates adults and children for each booking. To help speed up computation, the two features are combined into one called no_of_people
# Creating a new column called no_of_people that captures the total number of people for a booking
df1["no_of_people"] = df1["no_of_adults"] + df1["no_of_children"]
# Dropping no_of_adults and no_of_children
df1.drop(["no_of_adults", "no_of_children"], axis = 1, inplace = True)
# Verifying no_of_people column is created and no_of_adults and no_of_children are removed
df1
| | Booking_ID | no_of_weekend_nights | no_of_week_nights | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status | no_of_people |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled | 2 |
| 1 | INN00002 | 2 | 3 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled | 2 |
| 2 | INN00003 | 2 | 1 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled | 1 |
| 3 | INN00004 | 0 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled | 2 |
| 4 | INN00005 | 1 | 1 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 36270 | INN36271 | 2 | 6 | Meal Plan 1 | 0 | Room_Type 4 | 85 | 2018 | 8 | 3 | Online | 0 | 0 | 0 | 167.80 | 1 | Not_Canceled | 3 |
| 36271 | INN36272 | 1 | 3 | Meal Plan 1 | 0 | Room_Type 1 | 228 | 2018 | 10 | 17 | Online | 0 | 0 | 0 | 90.95 | 2 | Canceled | 2 |
| 36272 | INN36273 | 2 | 6 | Meal Plan 1 | 0 | Room_Type 1 | 148 | 2018 | 7 | 1 | Online | 0 | 0 | 0 | 98.39 | 2 | Not_Canceled | 2 |
| 36273 | INN36274 | 0 | 3 | Not Selected | 0 | Room_Type 1 | 63 | 2018 | 4 | 21 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled | 2 |
| 36274 | INN36275 | 1 | 2 | Meal Plan 1 | 0 | Room_Type 1 | 207 | 2018 | 12 | 30 | Offline | 0 | 0 | 0 | 161.67 | 0 | Not_Canceled | 2 |
36275 rows × 18 columns
# Right now the dataframe separates out number of weekend nights and number of week nights. Since there is no special rate for weekend/weeknight (dataset refers to the average price of the room), going to combine the features into one called no_of_days_stayed to help speed up computation.
# Creating a new column called no_of_days_stayed that will capture total duration of a booking
df1["no_of_days_stayed"] = df1["no_of_weekend_nights"] + df1["no_of_week_nights"]
# Dropping no_of_weekend_nights and no_of_week_nights
df1.drop(["no_of_weekend_nights", "no_of_week_nights"], axis = 1, inplace = True)
# Verifying no_of_days_stayed column is created and no_of_weekend_nights and no_of_week_nights are removed
df1
| | Booking_ID | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status | no_of_people | no_of_days_stayed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | Not_Canceled | 2 | 3 |
| 1 | INN00002 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | Not_Canceled | 2 | 5 |
| 2 | INN00003 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | Canceled | 1 | 3 |
| 3 | INN00004 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | Canceled | 2 | 2 |
| 4 | INN00005 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled | 2 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 36270 | INN36271 | Meal Plan 1 | 0 | Room_Type 4 | 85 | 2018 | 8 | 3 | Online | 0 | 0 | 0 | 167.80 | 1 | Not_Canceled | 3 | 8 |
| 36271 | INN36272 | Meal Plan 1 | 0 | Room_Type 1 | 228 | 2018 | 10 | 17 | Online | 0 | 0 | 0 | 90.95 | 2 | Canceled | 2 | 4 |
| 36272 | INN36273 | Meal Plan 1 | 0 | Room_Type 1 | 148 | 2018 | 7 | 1 | Online | 0 | 0 | 0 | 98.39 | 2 | Not_Canceled | 2 | 8 |
| 36273 | INN36274 | Not Selected | 0 | Room_Type 1 | 63 | 2018 | 4 | 21 | Online | 0 | 0 | 0 | 94.50 | 0 | Canceled | 2 | 3 |
| 36274 | INN36275 | Meal Plan 1 | 0 | Room_Type 1 | 207 | 2018 | 12 | 30 | Offline | 0 | 0 | 0 | 161.67 | 0 | Not_Canceled | 2 | 3 |
36275 rows × 17 columns
# Outlier detection using boxplot
numeric_columns = df1.select_dtypes(include=np.number).columns.tolist()
print(numeric_columns)
plt.figure(figsize=(15, 15))
for i, variable in enumerate(numeric_columns):
    plt.subplot(4, 4, i + 1)
    plt.boxplot(df1[variable], whis=1.5)
    plt.tight_layout()
    plt.title(variable)
plt.show()
['required_car_parking_space', 'lead_time', 'arrival_year', 'arrival_month', 'arrival_date', 'repeated_guest', 'no_of_previous_cancellations', 'no_of_previous_bookings_not_canceled', 'avg_price_per_room', 'no_of_special_requests', 'no_of_people', 'no_of_days_stayed']
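The boxplots above use `whis=1.5`, so points beyond 1.5 times the interquartile range from the quartiles are drawn as outliers. A minimal sketch of quantifying that share per column is given below; `iqr_outlier_share` is a hypothetical helper and the toy values are made up, not taken from the hotel data.

```python
import pandas as pd


def iqr_outlier_share(series, whis=1.5):
    """Share of values outside [Q1 - whis * IQR, Q3 + whis * IQR]."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    return ((series < lower) | (series > upper)).mean()


# Toy data (hypothetical values) with one obvious outlier.
toy = pd.Series([10, 12, 11, 13, 12, 11, 10, 95])
share = iqr_outlier_share(toy)
print(round(share, 3))
```

Applied to the notebook's data, something like `df1[numeric_columns].apply(iqr_outlier_share)` would give one share per numeric column, which can help decide whether any outlier treatment is worth considering.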
# Converting booking status values into numerical values since a logistic regression model needs a binary 0/1 target
# Canceled will be encoded as 1, and Not_Canceled will be encoded as 0
convert = {"Not_Canceled" : 0, "Canceled" : 1}
df2 = df1.replace({"booking_status": convert})
# Confirming datatype of booking_status is numerical
print(df2.booking_status.dtype)
# Confirming conversion of booking type to numerical is successful
df2
int64
| | Booking_ID | type_of_meal_plan | required_car_parking_space | room_type_reserved | lead_time | arrival_year | arrival_month | arrival_date | market_segment_type | repeated_guest | no_of_previous_cancellations | no_of_previous_bookings_not_canceled | avg_price_per_room | no_of_special_requests | booking_status | no_of_people | no_of_days_stayed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | INN00001 | Meal Plan 1 | 0 | Room_Type 1 | 224 | 2017 | 10 | 2 | Offline | 0 | 0 | 0 | 65.00 | 0 | 0 | 2 | 3 |
| 1 | INN00002 | Not Selected | 0 | Room_Type 1 | 5 | 2018 | 11 | 6 | Online | 0 | 0 | 0 | 106.68 | 1 | 0 | 2 | 5 |
| 2 | INN00003 | Meal Plan 1 | 0 | Room_Type 1 | 1 | 2018 | 2 | 28 | Online | 0 | 0 | 0 | 60.00 | 0 | 1 | 1 | 3 |
| 3 | INN00004 | Meal Plan 1 | 0 | Room_Type 1 | 211 | 2018 | 5 | 20 | Online | 0 | 0 | 0 | 100.00 | 0 | 1 | 2 | 2 |
| 4 | INN00005 | Not Selected | 0 | Room_Type 1 | 48 | 2018 | 4 | 11 | Online | 0 | 0 | 0 | 94.50 | 0 | 1 | 2 | 2 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 36270 | INN36271 | Meal Plan 1 | 0 | Room_Type 4 | 85 | 2018 | 8 | 3 | Online | 0 | 0 | 0 | 167.80 | 1 | 0 | 3 | 8 |
| 36271 | INN36272 | Meal Plan 1 | 0 | Room_Type 1 | 228 | 2018 | 10 | 17 | Online | 0 | 0 | 0 | 90.95 | 2 | 1 | 2 | 4 |
| 36272 | INN36273 | Meal Plan 1 | 0 | Room_Type 1 | 148 | 2018 | 7 | 1 | Online | 0 | 0 | 0 | 98.39 | 2 | 0 | 2 | 8 |
| 36273 | INN36274 | Not Selected | 0 | Room_Type 1 | 63 | 2018 | 4 | 21 | Online | 0 | 0 | 0 | 94.50 | 0 | 1 | 2 | 3 |
| 36274 | INN36275 | Meal Plan 1 | 0 | Room_Type 1 | 207 | 2018 | 12 | 30 | Offline | 0 | 0 | 0 | 161.67 | 0 | 0 | 2 | 3 |
36275 rows × 17 columns
# Dropping "Booking_ID" from dataframe since it is nominal variable that is unique for each booking and doesn't have any predictive power.
df2 = df2.drop("Booking_ID", axis=1)
# Preparing data for both logistic regression and decision tree modeling
# Splitting data for training and validation
X = df2.drop("booking_status", axis=1)
Y = df2["booking_status"]
# Importing warnings to ignore a Future Warning associated with get_dummies
import warnings
warnings.simplefilter(action = 'ignore', category = FutureWarning)
# Creating dummy variables
X = pd.get_dummies(X, drop_first=True)
# Adding constant for logistic regression model
X = sm.add_constant(X)
# Splitting into training and test sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 1)
print(X_train.shape, X_test.shape)
(25392, 26) (10883, 26)
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 25392
Number of rows in test data = 10883
print("Percentage of classes in training set:")
print(y_train.value_counts(normalize=True))
print("\n")
print("Percentage of classes in test set:")
print(y_test.value_counts(normalize=True))
Percentage of classes in training set:
0 0.670644
1 0.329356
Name: booking_status, dtype: float64

Percentage of classes in test set:
0 0.676376
1 0.323624
Name: booking_status, dtype: float64
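The class shares in the train and test sets above are close but not identical, because the split is purely random. As an optional alternative (not what this notebook does), `train_test_split` accepts a `stratify` argument that keeps the class proportions the same in both sets. The sketch below uses synthetic stand-ins for `X` and `Y`.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for X and Y (assumption: one third positives,
# roughly like the class balance printed above).
rng = np.random.default_rng(1)
X_demo = pd.DataFrame({"feature": rng.normal(size=3000)})
y_demo = pd.Series(np.repeat([0, 1], [2000, 1000]), name="booking_status")

# stratify=y_demo asks for the same class proportions in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=1, stratify=y_demo
)

print(y_tr.mean(), y_te.mean())
```

With an imbalance as mild as the one here the difference is small, so the unstratified split used in the notebook is a reasonable choice too.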
# Booking status for a booking
labeled_barplot(df2, "booking_status")
# Booking status for a booking as a percentage
labeled_barplot(df2, "booking_status", perc = True)
# Number of people for a booking
labeled_barplot(df2, "no_of_people")
# Number of people for a booking as a percentage
labeled_barplot(df2, "no_of_people", perc = True)
# Number of people for a booking histogram and boxplot
histogram_boxplot(df2, "no_of_people")
# Number of days stayed for a booking
labeled_barplot(df2, "no_of_days_stayed")
# Number of days stayed for a booking as a percentage
labeled_barplot(df2, "no_of_days_stayed", perc = True)
# Number of days stayed for a booking histogram and boxplot
histogram_boxplot(df2, "no_of_days_stayed")
# Correlation heatmap for all numerical variables
plt.figure(figsize=(15, 7))
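# Restricting to numeric columns so that text columns (meal plan, room type, market segment) are excluded from the correlation
sns.heatmap(df2.select_dtypes(include=np.number).corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")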
plt.show()
# Booking_status vs no_of_people stacked barplot
stacked_barplot(df1, "no_of_people", "booking_status")
| no_of_people | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 2 | 8280 | 15662 | 23942 |
| 1 | 1809 | 5743 | 7552 |
| 3 | 1392 | 2459 | 3851 |
| 4 | 398 | 514 | 912 |
| 5 | 5 | 10 | 15 |
| 11 | 1 | 0 | 1 |
| 10 | 0 | 1 | 1 |
| 12 | 0 | 1 | 1 |
# Booking_status vs no_of_people distribution plot
distribution_plot_wrt_target(df1, "no_of_people", "booking_status")
# Booking_status vs no_of_days_stayed stacked barplot
stacked_barplot(df1, "no_of_days_stayed", "booking_status")
| no_of_days_stayed | Canceled | Not_Canceled | All |
|---|---|---|---|
| All | 11885 | 24390 | 36275 |
| 3 | 3586 | 6466 | 10052 |
| 2 | 2899 | 5573 | 8472 |
| 4 | 1941 | 3952 | 5893 |
| 1 | 1466 | 5138 | 6604 |
| 5 | 823 | 1766 | 2589 |
| 6 | 465 | 566 | 1031 |
| 7 | 383 | 590 | 973 |
| 8 | 79 | 100 | 179 |
| 10 | 58 | 51 | 109 |
| 9 | 53 | 58 | 111 |
| 14 | 27 | 5 | 32 |
| 15 | 26 | 5 | 31 |
| 13 | 15 | 3 | 18 |
| 12 | 15 | 9 | 24 |
| 11 | 15 | 24 | 39 |
| 20 | 8 | 3 | 11 |
| 16 | 5 | 1 | 6 |
| 19 | 5 | 1 | 6 |
| 17 | 4 | 1 | 5 |
| 18 | 3 | 0 | 3 |
| 21 | 3 | 1 | 4 |
| 22 | 2 | 0 | 2 |
| 0 | 2 | 76 | 78 |
| 23 | 1 | 1 | 2 |
| 24 | 1 | 0 | 1 |
# Booking_status vs no_of_days_stayed distribution plot
distribution_plot_wrt_target(df1, "no_of_days_stayed", "booking_status")
One way to check for multicollinearity is to look at Variance Inflation Factor (VIF) values.
Variance Inflation Factor: The variance inflation factor measures the inflation in the variances of the regression coefficient estimates due to collinearities that exist among the predictors. It is a measure of how much the variance of the estimated regression coefficient β_k is “inflated” by the existence of correlation among the predictor variables in the model.
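For reference, the standard definition can be written as below. Here $R_k^2$ denotes the coefficient of determination obtained by regressing the k-th predictor on all the other predictors; the symbols are notation introduced for this note, not taken from the notebook.

```latex
\mathrm{VIF}_k = \frac{1}{1 - R_k^2}
```

A predictor that is uncorrelated with the others has $R_k^2 = 0$ and therefore a VIF of 1, and the VIF grows without bound as $R_k^2$ approaches 1.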
General Rule of thumb:
# Calculating VIF values for different features
vif = pd.DataFrame()
vif["VIF Factor"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
vif["features"] = X.columns
vif.round(2)
| | VIF Factor | features |
|---|---|---|
| 0 | 39412871.11 | const |
| 1 | 1.04 | required_car_parking_space |
| 2 | 1.38 | lead_time |
| 3 | 1.43 | arrival_year |
| 4 | 1.27 | arrival_month |
| 5 | 1.01 | arrival_date |
| 6 | 1.76 | repeated_guest |
| 7 | 1.35 | no_of_previous_cancellations |
| 8 | 1.61 | no_of_previous_bookings_not_canceled |
| 9 | 2.03 | avg_price_per_room |
| 10 | 1.25 | no_of_special_requests |
| 11 | 1.65 | no_of_people |
| 12 | 1.10 | no_of_days_stayed |
| 13 | 1.26 | type_of_meal_plan_Meal Plan 2 |
| 14 | 1.02 | type_of_meal_plan_Meal Plan 3 |
| 15 | 1.26 | type_of_meal_plan_Not Selected |
| 16 | 1.04 | room_type_reserved_Room_Type 2 |
| 17 | 1.00 | room_type_reserved_Room_Type 3 |
| 18 | 1.31 | room_type_reserved_Room_Type 4 |
| 19 | 1.03 | room_type_reserved_Room_Type 5 |
| 20 | 1.49 | room_type_reserved_Room_Type 6 |
| 21 | 1.10 | room_type_reserved_Room_Type 7 |
| 22 | 4.40 | market_segment_type_Complementary |
| 23 | 16.54 | market_segment_type_Corporate |
| 24 | 62.36 | market_segment_type_Offline |
| 25 | 69.22 | market_segment_type_Online |
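The market segment dummies stand out in the table above. One plausible reading (an interpretation, not something the notebook states) is that `drop_first=True` leaves Aviation as the reference level, and Aviation has only 125 of the 36275 bookings, so the remaining dummies almost add up to the constant column. The sketch below checks this idea with a manual VIF on a design matrix that holds only a constant and the segment dummies, rebuilt from the segment counts printed earlier; the notebook's model has more predictors, so the exact values will differ.

```python
import numpy as np


def vif(design, k):
    """VIF of column k: regress it on the remaining columns and return 1 / (1 - R^2)."""
    y = design[:, k]
    others = np.delete(design, k, axis=1)
    coef, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)


# Segment sizes copied from the market_segment_type crosstab earlier in the notebook.
counts = {"Aviation": 125, "Complementary": 391, "Corporate": 2017,
          "Offline": 10528, "Online": 23214}
labels = np.repeat(list(counts), list(counts.values()))


def design_without(baseline):
    """Constant column plus one dummy per segment, leaving out the baseline segment."""
    kept = [c for c in counts if c != baseline]
    dummies = np.column_stack([(labels == c).astype(float) for c in kept])
    return kept, np.column_stack([np.ones(len(labels)), dummies])


for baseline in ("Aviation", "Online"):
    kept, design = design_without(baseline)
    vifs = {c: round(vif(design, i + 1), 2) for i, c in enumerate(kept)}
    print("baseline =", baseline, vifs)
```

If this reading is right, high VIFs on these dummies mostly reflect the choice of a rare reference level rather than redundancy among distinct features, which is one reason dummy variables are often left out of VIF-based pruning.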
# Defining a function to compute different metrics to check performance of a classification model built using statsmodels
def model_performance_classification_statsmodels(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance
    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    # checking which probabilities are greater than threshold
    pred_temp = model.predict(predictors) > threshold
    # rounding off the above values to get classes
    pred = np.round(pred_temp)
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )
    return df_perf
# Defining a function to plot the confusion_matrix of a classification model
def confusion_matrix_statsmodels(model, predictors, target, threshold=0.5):
    """
    To plot the confusion_matrix with percentages
    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """
    y_pred = model.predict(predictors) > threshold
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)
    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
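Both helpers above turn predicted probabilities into classes with a threshold before scoring them. To make explicit what the four metrics measure and how the threshold trades precision against recall, here is a small sketch that computes them from the confusion-matrix counts by hand. `metrics_at_threshold` is a hypothetical name, and the labels and probabilities are toy values.

```python
import numpy as np


def metrics_at_threshold(y_true, proba, threshold=0.5):
    """Accuracy, recall, precision and F1 for class 1 at a given probability threshold."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(proba) > threshold).astype(int)
    tp = int(((pred == 1) & (y_true == 1)).sum())  # canceled, predicted canceled
    fp = int(((pred == 1) & (y_true == 0)).sum())  # kept, predicted canceled
    fn = int(((pred == 0) & (y_true == 1)).sum())  # canceled, predicted kept
    tn = int(((pred == 0) & (y_true == 0)).sum())  # kept, predicted kept
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"Accuracy": (tp + tn) / len(y_true), "Recall": recall,
            "Precision": precision, "F1": f1}


# Toy labels and predicted probabilities (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
proba = [0.9, 0.6, 0.4, 0.2, 0.7, 0.1, 0.3, 0.55]

for t in (0.5, 0.35):
    print(t, metrics_at_threshold(y_true, proba, threshold=t))
```

Lowering the threshold flags more bookings as likely cancellations, which tends to raise recall at the cost of precision.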
# Fitting logistic regression model
logit = sm.Logit(y_train, X_train.astype(float))
lg = logit.fit(disp=False)
print(lg.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25366
Method: MLE Df Model: 25
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3291
Time: 22:11:19 Log-Likelihood: -10796.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -928.3326 120.669 -7.693 0.000 -1164.840 -691.825
required_car_parking_space -1.5941 0.138 -11.573 0.000 -1.864 -1.324
lead_time 0.0156 0.000 58.899 0.000 0.015 0.016
arrival_year 0.4588 0.060 7.673 0.000 0.342 0.576
arrival_month -0.0416 0.006 -6.430 0.000 -0.054 -0.029
arrival_date 0.0006 0.002 0.323 0.747 -0.003 0.004
repeated_guest -2.3373 0.618 -3.782 0.000 -3.549 -1.126
no_of_previous_cancellations 0.2659 0.085 3.115 0.002 0.099 0.433
no_of_previous_bookings_not_canceled -0.1730 0.153 -1.129 0.259 -0.473 0.127
avg_price_per_room 0.0187 0.001 25.327 0.000 0.017 0.020
no_of_special_requests -1.4682 0.030 -48.777 0.000 -1.527 -1.409
no_of_people 0.1284 0.033 3.861 0.000 0.063 0.194
no_of_days_stayed 0.0605 0.010 6.351 0.000 0.042 0.079
type_of_meal_plan_Meal Plan 2 0.1815 0.067 2.728 0.006 0.051 0.312
type_of_meal_plan_Meal Plan 3 19.6676 1.27e+04 0.002 0.999 -2.48e+04 2.49e+04
type_of_meal_plan_Not Selected 0.2738 0.053 5.189 0.000 0.170 0.377
room_type_reserved_Room_Type 2 -0.3411 0.127 -2.680 0.007 -0.590 -0.092
room_type_reserved_Room_Type 3 -0.0004 1.306 -0.000 1.000 -2.561 2.560
room_type_reserved_Room_Type 4 -0.2911 0.052 -5.583 0.000 -0.393 -0.189
room_type_reserved_Room_Type 5 -0.7131 0.209 -3.414 0.001 -1.122 -0.304
room_type_reserved_Room_Type 6 -0.8973 0.128 -7.025 0.000 -1.148 -0.647
room_type_reserved_Room_Type 7 -1.3763 0.290 -4.740 0.000 -1.945 -0.807
market_segment_type_Complementary -47.7405 6.07e+06 -7.86e-06 1.000 -1.19e+07 1.19e+07
market_segment_type_Corporate -1.2145 0.266 -4.571 0.000 -1.735 -0.694
market_segment_type_Offline -2.2104 0.254 -8.703 0.000 -2.708 -1.713
market_segment_type_Online -0.4095 0.251 -1.633 0.102 -0.901 0.082
========================================================================================================
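The summary above reports `converged: False`, and the coefficients for `type_of_meal_plan_Meal Plan 3` and `market_segment_type_Complementary` come with very large standard errors. A plausible cause (an interpretation, not something the notebook states) is quasi-complete separation: the earlier crosstab shows that no Complementary booking was canceled, so the likelihood keeps improving as that coefficient moves toward minus infinity. The sketch below flags category levels whose target takes a single value; `single_outcome_levels` is a hypothetical helper and the rows are toy data.

```python
import pandas as pd


def single_outcome_levels(df, col, target):
    """Return the levels of col in which the target takes only one distinct value."""
    n_outcomes = df.groupby(col)[target].nunique()
    return n_outcomes[n_outcomes == 1].index.tolist()


# Toy frame (hypothetical rows) mimicking a segment that never cancels.
toy = pd.DataFrame(
    {
        "market_segment_type": ["Online", "Online", "Offline", "Offline",
                                "Complementary", "Complementary"],
        "booking_status": [1, 0, 1, 0, 0, 0],
    }
)

flagged = single_outcome_levels(toy, "market_segment_type", "booking_status")
print(flagged)
```

Running such a check on each categorical column of the training data would show which dummies are likely behind the convergence message; their coefficients and odds should be read with caution.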
# Model performance evaluation on initial X_train
print("Training performance:")
model_performance_classification_statsmodels(lg, X_train, y_train)
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.804939 | 0.629439 | 0.739534 | 0.680059 |
# Dropping predictor columns with highest p-values > 0.05 one by one (ignoring dummy variables)
# First column to drop is arrival_date (p-value = 0.747)
drop_col = "arrival_date"
X_train1 = X_train.drop(drop_col, axis=1)
# Checking if p-values for other variables became < 0.05
print('p-values after dropping', drop_col)
logit1 = sm.Logit(y_train, X_train1.astype(float))
lg1 = logit1.fit(disp=False)
print(lg1.summary())
p-values after dropping arrival_date
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25367
Method: MLE Df Model: 24
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3291
Time: 22:11:19 Log-Likelihood: -10796.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
========================================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------------------
const -928.2536 120.675 -7.692 0.000 -1164.773 -691.734
required_car_parking_space -1.5943 0.138 -11.574 0.000 -1.864 -1.324
lead_time 0.0157 0.000 58.911 0.000 0.015 0.016
arrival_year 0.4588 0.060 7.672 0.000 0.342 0.576
arrival_month -0.0417 0.006 -6.458 0.000 -0.054 -0.029
repeated_guest -2.3387 0.618 -3.783 0.000 -3.550 -1.127
no_of_previous_cancellations 0.2657 0.085 3.112 0.002 0.098 0.433
no_of_previous_bookings_not_canceled -0.1727 0.153 -1.127 0.260 -0.473 0.128
avg_price_per_room 0.0187 0.001 25.327 0.000 0.017 0.020
no_of_special_requests -1.4680 0.030 -48.786 0.000 -1.527 -1.409
no_of_people 0.1287 0.033 3.872 0.000 0.064 0.194
no_of_days_stayed 0.0605 0.010 6.353 0.000 0.042 0.079
type_of_meal_plan_Meal Plan 2 0.1820 0.067 2.736 0.006 0.052 0.312
type_of_meal_plan_Meal Plan 3 19.8598 1.39e+04 0.001 0.999 -2.73e+04 2.73e+04
type_of_meal_plan_Not Selected 0.2740 0.053 5.193 0.000 0.171 0.377
room_type_reserved_Room_Type 2 -0.3404 0.127 -2.675 0.007 -0.590 -0.091
room_type_reserved_Room_Type 3 -0.0024 1.306 -0.002 0.999 -2.563 2.558
room_type_reserved_Room_Type 4 -0.2908 0.052 -5.578 0.000 -0.393 -0.189
room_type_reserved_Room_Type 5 -0.7127 0.209 -3.413 0.001 -1.122 -0.303
room_type_reserved_Room_Type 6 -0.8973 0.128 -7.025 0.000 -1.148 -0.647
room_type_reserved_Room_Type 7 -1.3758 0.290 -4.738 0.000 -1.945 -0.807
market_segment_type_Complementary -48.3733 7.57e+06 -6.39e-06 1.000 -1.48e+07 1.48e+07
market_segment_type_Corporate -1.2136 0.266 -4.567 0.000 -1.734 -0.693
market_segment_type_Offline -2.2111 0.254 -8.705 0.000 -2.709 -1.713
market_segment_type_Online -0.4098 0.251 -1.634 0.102 -0.901 0.082
========================================================================================================
# Dropping predictor columns with highest p-values > 0.05 one by one (ignoring dummy variables)
# Second column to drop is no_of_previous_bookings_not_canceled (p-value = 0.260)
drop_col = "no_of_previous_bookings_not_canceled"
X_train2 = X_train1.drop(drop_col, axis=1)
# Checking if p-values for other variables became < 0.05
print('p-values after dropping', drop_col)
logit2 = sm.Logit(y_train, X_train2.astype(float))
lg2 = logit2.fit(disp=False)
print(lg2.summary())
p-values after dropping no_of_previous_bookings_not_canceled
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25368
Method: MLE Df Model: 23
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3290
Time: 22:11:19 Log-Likelihood: -10798.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
=====================================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------------------------
const -926.5791 120.684 -7.678 0.000 -1163.116 -690.042
required_car_parking_space -1.5936 0.138 -11.569 0.000 -1.864 -1.324
lead_time 0.0157 0.000 58.954 0.000 0.015 0.016
arrival_year 0.4580 0.060 7.658 0.000 0.341 0.575
arrival_month -0.0417 0.006 -6.457 0.000 -0.054 -0.029
repeated_guest -2.7417 0.558 -4.914 0.000 -3.835 -1.648
no_of_previous_cancellations 0.2295 0.077 2.997 0.003 0.079 0.380
avg_price_per_room 0.0187 0.001 25.338 0.000 0.017 0.020
no_of_special_requests -1.4686 0.030 -48.804 0.000 -1.528 -1.410
no_of_people 0.1287 0.033 3.871 0.000 0.064 0.194
no_of_days_stayed 0.0605 0.010 6.349 0.000 0.042 0.079
type_of_meal_plan_Meal Plan 2 0.1810 0.067 2.721 0.006 0.051 0.311
type_of_meal_plan_Meal Plan 3 20.0754 1.55e+04 0.001 0.999 -3.04e+04 3.04e+04
type_of_meal_plan_Not Selected 0.2741 0.053 5.194 0.000 0.171 0.378
room_type_reserved_Room_Type 2 -0.3407 0.127 -2.677 0.007 -0.590 -0.091
room_type_reserved_Room_Type 3 -0.0032 1.306 -0.002 0.998 -2.564 2.557
room_type_reserved_Room_Type 4 -0.2910 0.052 -5.582 0.000 -0.393 -0.189
room_type_reserved_Room_Type 5 -0.7128 0.209 -3.414 0.001 -1.122 -0.304
room_type_reserved_Room_Type 6 -0.8978 0.128 -7.028 0.000 -1.148 -0.647
room_type_reserved_Room_Type 7 -1.3764 0.290 -4.740 0.000 -1.946 -0.807
market_segment_type_Complementary -33.5117 1.6e+04 -0.002 0.998 -3.15e+04 3.14e+04
market_segment_type_Corporate -1.2211 0.266 -4.594 0.000 -1.742 -0.700
market_segment_type_Offline -2.2137 0.254 -8.712 0.000 -2.712 -1.716
market_segment_type_Online -0.4124 0.251 -1.644 0.100 -0.904 0.079
=====================================================================================================
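The two manual steps above follow one pattern: refit, find the non-dummy predictor with the highest p-value, and drop it if it exceeds 0.05. A minimal sketch of that loop is given below. `backward_eliminate` and `fake_pvalues` are hypothetical names; the p-value source is passed in as a function so the sketch runs without refitting a model, and the demo p-values are copied from the first summary and held fixed, whereas in a real run they change after every refit.

```python
def backward_eliminate(columns, pvalues_for, alpha=0.05, protected=()):
    """Drop the highest-p-value column one at a time until none exceeds alpha.

    columns: candidate column names
    pvalues_for: callable taking a list of columns, returning {column: p-value}
    protected: columns that are never dropped (e.g. the constant and dummies)
    """
    kept = list(columns)
    while True:
        pvals = pvalues_for(kept)
        candidates = {c: p for c, p in pvals.items() if c in kept and c not in protected}
        if not candidates:
            return kept
        worst = max(candidates, key=candidates.get)
        if candidates[worst] <= alpha:
            return kept
        kept.remove(worst)


# Demo p-values copied from the first model summary (held fixed for illustration).
demo_p = {"const": 0.0, "lead_time": 0.0, "arrival_date": 0.747,
          "no_of_previous_bookings_not_canceled": 0.259,
          "market_segment_type_Online": 0.102}


def fake_pvalues(cols):
    return {c: demo_p[c] for c in cols}


# In the notebook, pvalues_for could plausibly wrap the refit, for example:
#   lambda cols: sm.Logit(y_train, X_train[cols].astype(float)).fit(disp=False).pvalues.to_dict()
kept = backward_eliminate(list(demo_p), fake_pvalues,
                          protected={"const", "market_segment_type_Online"})
print(kept)
```

Dropping one column per refit, rather than all high p-value columns at once, matters because removing a predictor can change the p-values of the ones that remain.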
# After dropping arrival_date and no_of_previous_bookings_not_canceled, remaining predictor columns (ignoring dummy variables) now have p-values < 0.05
# Building a new logistic regression model with X_train2
logit2 = sm.Logit(y_train, X_train2.astype(float))
lg2 = logit2.fit(disp=False)
print(lg2.summary())
Logit Regression Results
==============================================================================
Dep. Variable: booking_status No. Observations: 25392
Model: Logit Df Residuals: 25368
Method: MLE Df Model: 23
Date: Fri, 25 Feb 2022 Pseudo R-squ.: 0.3290
Time: 22:11:20 Log-Likelihood: -10798.
converged: False LL-Null: -16091.
Covariance Type: nonrobust LLR p-value: 0.000
=====================================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------------------------
const -926.5791 120.684 -7.678 0.000 -1163.116 -690.042
required_car_parking_space -1.5936 0.138 -11.569 0.000 -1.864 -1.324
lead_time 0.0157 0.000 58.954 0.000 0.015 0.016
arrival_year 0.4580 0.060 7.658 0.000 0.341 0.575
arrival_month -0.0417 0.006 -6.457 0.000 -0.054 -0.029
repeated_guest -2.7417 0.558 -4.914 0.000 -3.835 -1.648
no_of_previous_cancellations 0.2295 0.077 2.997 0.003 0.079 0.380
avg_price_per_room 0.0187 0.001 25.338 0.000 0.017 0.020
no_of_special_requests -1.4686 0.030 -48.804 0.000 -1.528 -1.410
no_of_people 0.1287 0.033 3.871 0.000 0.064 0.194
no_of_days_stayed 0.0605 0.010 6.349 0.000 0.042 0.079
type_of_meal_plan_Meal Plan 2 0.1810 0.067 2.721 0.006 0.051 0.311
type_of_meal_plan_Meal Plan 3 20.0754 1.55e+04 0.001 0.999 -3.04e+04 3.04e+04
type_of_meal_plan_Not Selected 0.2741 0.053 5.194 0.000 0.171 0.378
room_type_reserved_Room_Type 2 -0.3407 0.127 -2.677 0.007 -0.590 -0.091
room_type_reserved_Room_Type 3 -0.0032 1.306 -0.002 0.998 -2.564 2.557
room_type_reserved_Room_Type 4 -0.2910 0.052 -5.582 0.000 -0.393 -0.189
room_type_reserved_Room_Type 5 -0.7128 0.209 -3.414 0.001 -1.122 -0.304
room_type_reserved_Room_Type 6 -0.8978 0.128 -7.028 0.000 -1.148 -0.647
room_type_reserved_Room_Type 7 -1.3764 0.290 -4.740 0.000 -1.946 -0.807
market_segment_type_Complementary -33.5117 1.6e+04 -0.002 0.998 -3.15e+04 3.14e+04
market_segment_type_Corporate -1.2211 0.266 -4.594 0.000 -1.742 -0.700
market_segment_type_Offline -2.2137 0.254 -8.712 0.000 -2.712 -1.716
market_segment_type_Online -0.4124 0.251 -1.644 0.100 -0.904 0.079
=====================================================================================================
# Converting coefficients to odds
odds = np.exp(lg2.params)
# Finding the percentage change
perc_change_odds = (np.exp(lg2.params) - 1) * 100
# Removing limit from number of columns to display
pd.set_option("display.max_columns", None)
# Adding the odds to a dataframe
pd.DataFrame({"Odds": odds, "Change_odd%": perc_change_odds}, index=X_train2.columns).T
| | const | required_car_parking_space | lead_time | arrival_year | arrival_month | repeated_guest | no_of_previous_cancellations | avg_price_per_room | no_of_special_requests | no_of_people | no_of_days_stayed | type_of_meal_plan_Meal Plan 2 | type_of_meal_plan_Meal Plan 3 | type_of_meal_plan_Not Selected | room_type_reserved_Room_Type 2 | room_type_reserved_Room_Type 3 | room_type_reserved_Room_Type 4 | room_type_reserved_Room_Type 5 | room_type_reserved_Room_Type 6 | room_type_reserved_Room_Type 7 | market_segment_type_Complementary | market_segment_type_Corporate | market_segment_type_Offline | market_segment_type_Online |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Odds | 0.0 | 0.203186 | 1.015782 | 1.580834 | 0.959150 | 0.064462 | 1.257912 | 1.018876 | 0.230258 | 1.137322 | 1.062335 | 1.198417 | 5.231699e+08 | 1.315327 | 0.711305 | 0.996855 | 0.747500 | 0.490268 | 0.407464 | 0.252474 | 2.792858e-15 | 0.294912 | 0.109292 | 0.662046 |
| Change_odd% | -100.0 | -79.681395 | 1.578226 | 58.083417 | -4.085031 | -93.553767 | 25.791192 | 1.887620 | -76.974163 | 13.732156 | 6.233493 | 19.841679 | 5.231699e+10 | 31.532705 | -28.869544 | -0.314541 | -25.250026 | -50.973184 | -59.253642 | -74.752550 | -1.000000e+02 | -70.508789 | -89.070769 | -33.795434 |
required_car_parking_space: Holding all other features constant, a one-unit increase in required_car_parking_space multiplies the odds of a booking being canceled by 0.203, i.e., a 79.68% decrease in the odds.
lead_time: Holding all other features constant, a one-unit increase in lead_time multiplies the odds of a booking being canceled by 1.016, i.e., a 1.58% increase in the odds.
arrival_year: Holding all other features constant, a one-unit increase in arrival_year multiplies the odds of a booking being canceled by 1.581, i.e., a 58.08% increase in the odds.
arrival_month: Holding all other features constant, a one-unit increase in arrival_month multiplies the odds of a booking being canceled by 0.959, i.e., a 4.09% decrease in the odds.
repeated_guest: Holding all other features constant, a one-unit increase in repeated_guest multiplies the odds of a booking being canceled by 0.064, i.e., a 93.55% decrease in the odds.
no_of_previous_cancellations: Holding all other features constant, a one-unit increase in no_of_previous_cancellations multiplies the odds of a booking being canceled by 1.258, i.e., a 25.79% increase in the odds.
avg_price_per_room: Holding all other features constant, a one-unit increase in avg_price_per_room multiplies the odds of a booking being canceled by 1.019, i.e., a 1.89% increase in the odds.
no_of_special_requests: Holding all other features constant, a one-unit increase in no_of_special_requests multiplies the odds of a booking being canceled by 0.230, i.e., a 76.97% decrease in the odds.
no_of_people: Holding all other features constant, a one-unit increase in no_of_people multiplies the odds of a booking being canceled by 1.137, i.e., a 13.73% increase in the odds.
no_of_days_stayed: Holding all other features constant, a one-unit increase in no_of_days_stayed multiplies the odds of a booking being canceled by 1.062, i.e., a 6.23% increase in the odds.
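The interpretations above come from exponentiating each coefficient. The following is a minimal, standard-library sketch of that conversion, using two coefficients copied from the summary table; the helper name `odds_change` is hypothetical and not part of the notebook.

```python
import math


def odds_change(coef, units=1.0):
    """Hypothetical helper: return (odds multiplier, percent change in odds)
    for an increase of `units` in a predictor with logit coefficient `coef`."""
    multiplier = math.exp(coef * units)
    return multiplier, (multiplier - 1) * 100


# Coefficients copied from the lg2 summary output
parking_mult, parking_pct = odds_change(-1.5936)           # required_car_parking_space
lead_mult_10, lead_pct_10 = odds_change(0.0157, units=10)  # 10 extra days of lead_time

print(f"parking space: odds x {parking_mult:.3f} ({parking_pct:.2f}%)")
print(f"lead_time +10 days: odds x {lead_mult_10:.3f} ({lead_pct_10:.2f}%)")
```

Because the effect is multiplicative, a 10-day increase in lead_time multiplies the odds by exp(10 × 0.0157) rather than by 10 times the one-day percentage.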
# Creating confusion matrix to check performance on training set
confusion_matrix_statsmodels(lg2, X_train2, y_train)
# Checking model performance on X_train2 and y_train
log_reg_model_train_perf = model_performance_classification_statsmodels(lg2, X_train2, y_train)
print("Training performance:")
log_reg_model_train_perf
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.804624 | 0.628124 | 0.739443 | 0.679253 |
# Generating ROC curve with lg2 and X_train2
logit_roc_auc_train = roc_auc_score(y_train, lg2.predict(X_train2))
fpr, tpr, thresholds = roc_curve(y_train, lg2.predict(X_train2))
plt.figure(figsize = (7, 5))
plt.plot(fpr, tpr, label = "Logistic Regression (area = %0.2f)" % logit_roc_auc_train)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc = "lower right")
plt.show()
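The area reported in the plot legend can also be read as a ranking probability: the chance that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case, with ties counted as one half. The sketch below shows this on toy data; the function name `auc_by_ranking` and the toy values are illustrative assumptions, not taken from the hotel data.

```python
def auc_by_ranking(y_true, scores):
    """Hypothetical helper: AUC as the fraction of (positive, negative) pairs
    in which the positive case gets the higher score (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (illustrative only)
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.20, 0.35, 0.60, 0.55, 0.70, 0.30, 0.90]

auc_toy = auc_by_ranking(y_toy, p_toy)
print(auc_toy)
```

This pairwise form is quadratic in the number of samples, so it is useful for intuition rather than as a replacement for `roc_auc_score` on the full training set.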
# Looking to improve recall score by changing the model threshold per AUC-ROC curve
# The optimal cut off would be where tpr is high and fpr is low
fpr, tpr, thresholds = roc_curve(y_train, lg2.predict(X_train2))
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold_auc_roc = thresholds[optimal_idx]
print(optimal_threshold_auc_roc)
0.3490219110870503
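Picking the index that maximizes `tpr - fpr` is the same as maximizing Youden's J statistic. The sketch below reproduces the idea without any libraries on a small toy example; `roc_points` and the toy data are assumptions made for illustration, and the predicted class is taken to be positive when the score is at or above the threshold.

```python
def roc_points(y_true, scores):
    """Hypothetical helper: return (thresholds, tpr, fpr), predicting the
    positive class whenever score >= threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    thresholds = sorted(set(scores), reverse=True)
    tpr, fpr = [], []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        tpr.append(tp / pos)
        fpr.append(fp / neg)
    return thresholds, tpr, fpr


# Toy labels and predicted probabilities (illustrative only)
y_toy = [0, 0, 1, 0, 1, 1, 0, 1]
p_toy = [0.10, 0.20, 0.35, 0.60, 0.55, 0.70, 0.30, 0.90]

thresholds_toy, tpr_toy, fpr_toy = roc_points(y_toy, p_toy)
j_scores = [t - f for t, f in zip(tpr_toy, fpr_toy)]  # Youden's J per threshold
best_idx = max(range(len(j_scores)), key=lambda i: j_scores[i])
best_threshold = thresholds_toy[best_idx]
print(best_threshold, j_scores[best_idx])
```

This criterion weights the true positive rate and the false positive rate equally; when missed cancellations are costlier than false alarms, a lower threshold than the J-optimal one may be preferable.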
# Creating confusion matrix
confusion_matrix_statsmodels(lg2, X_train2, y_train, threshold = optimal_threshold_auc_roc)
# Checking model performance for this model with AUC-ROC optimal threshold
log_reg_model_train_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg2, X_train2, y_train, threshold = optimal_threshold_auc_roc
)
print("Training performance:")
log_reg_model_train_perf_threshold_auc_roc
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.787138 | 0.75284 | 0.653519 | 0.699672 |
# Looking to see if a better threshold can be found with Precision-Recall Curve
y_scores = lg2.predict(X_train2)
prec, rec, tre = precision_recall_curve(y_train, y_scores)
def plot_prec_recall_vs_tresh(precisions, recalls, thresholds):
    plt.plot(thresholds, precisions[:-1], "b--", label="precision")
    plt.plot(thresholds, recalls[:-1], "g--", label="recall")
    plt.xlabel("Threshold")
    plt.legend(loc="upper left")
    plt.ylim([0, 1])
plt.figure(figsize=(10, 7))
plot_prec_recall_vs_tresh(prec, rec, tre)
plt.show()
# Setting the threshold
optimal_threshold_curve = 0.39
# Creating confusion matrix
confusion_matrix_statsmodels(lg2, X_train2, y_train, threshold = optimal_threshold_curve)
log_reg_model_train_perf_threshold_curve = model_performance_classification_statsmodels(
lg2, X_train2, y_train, threshold = optimal_threshold_curve
)
print("Training performance:")
log_reg_model_train_perf_threshold_curve
Training performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.795723 | 0.721392 | 0.678628 | 0.699357 |
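The 0.39 threshold above is read off the precision-recall plot by eye. One possible way to automate that choice, shown here only as a sketch on toy data, is to pick the candidate threshold at which precision and recall are closest to each other. The helper `precision_recall_at` and the toy values are hypothetical and are not the method used in the notebook.

```python
def precision_recall_at(y_true, scores, threshold):
    """Hypothetical helper: precision and recall when predicting the
    positive class for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels and predicted probabilities (illustrative only)
y_toy = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
p_toy = [0.05, 0.15, 0.25, 0.30, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85]

# Use the observed scores as candidate thresholds and measure the gap
gaps = {}
for t in sorted(set(p_toy)):
    precision, recall = precision_recall_at(y_toy, p_toy, t)
    gaps[t] = abs(precision - recall)

crossover = min(gaps, key=gaps.get)
print(crossover, precision_recall_at(y_toy, p_toy, crossover))
```

At the crossover the number of false positives equals the number of false negatives, which is why this point balances the two error types.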
Dropping the columns from the test set that were dropped from the training set
X_test1 = X_test[X_train2.columns].astype(float)
Using model with default threshold
# Creating confusion matrix
confusion_matrix_statsmodels(lg2, X_test1, y_test)
# Checking model performance on the test set
log_reg_model_test_perf = model_performance_classification_statsmodels(
lg2, X_test1, y_test
)
print("Test performance:")
log_reg_model_test_perf
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.803179 | 0.626917 | 0.727273 | 0.673376 |
# ROC-AUC Curve for test set
logit_roc_auc_test = roc_auc_score(y_test, lg2.predict(X_test1))
fpr, tpr, thresholds = roc_curve(y_test, lg2.predict(X_test1))
plt.figure(figsize = (7, 5))
plt.plot(fpr, tpr, label= "Logistic Regression (area = %0.2f)" % logit_roc_auc_test)
plt.plot([0, 1], [0, 1], "r--")
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Receiver operating characteristic")
plt.legend(loc = "lower right")
plt.show()
# Creating confusion matrix with model threshold = 0.35
confusion_matrix_statsmodels(lg2, X_test1, y_test, threshold = optimal_threshold_auc_roc)
# Checking model performance for this model with the optimal AUC-ROC threshold
log_reg_model_test_perf_threshold_auc_roc = model_performance_classification_statsmodels(
lg2, X_test1, y_test, threshold = optimal_threshold_auc_roc
)
print("Test performance:")
log_reg_model_test_perf_threshold_auc_roc
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.79252 | 0.76008 | 0.654523 | 0.703363 |
# Creating confusion matrix with model threshold = 0.39
confusion_matrix_statsmodels(lg2, X_test1, y_test, threshold = optimal_threshold_curve)
# Checking model performance for this model with optimal threshold curve
log_reg_model_test_perf_threshold_curve = model_performance_classification_statsmodels(
lg2, X_test1, y_test, threshold = optimal_threshold_curve
)
print("Test performance:")
log_reg_model_test_perf_threshold_curve
Test performance:
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.800331 | 0.727712 | 0.678581 | 0.702288 |
# Training performance comparison
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Logistic Regression statsmodel",
"Logistic Regression-0.35 Threshold",
"Logistic Regression-0.39 Threshold",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Logistic Regression statsmodel | Logistic Regression-0.35 Threshold | Logistic Regression-0.39 Threshold |
|---|---|---|---|
| Accuracy | 0.804624 | 0.787138 | 0.795723 |
| Recall | 0.628124 | 0.752840 | 0.721392 |
| Precision | 0.739443 | 0.653519 | 0.678628 |
| F1 | 0.679253 | 0.699672 | 0.699357 |
# Testing performance comparison
models_test_comp_df = pd.concat(
[
log_reg_model_test_perf.T,
log_reg_model_test_perf_threshold_auc_roc.T,
log_reg_model_test_perf_threshold_curve.T,
],
axis=1,
)
models_test_comp_df.columns = [
"Logistic Regression statsmodel",
"Logistic Regression-0.35 Threshold",
"Logistic Regression-0.39 Threshold",
]
print("Test set performance comparison:")
models_test_comp_df
Test set performance comparison:
| | Logistic Regression statsmodel | Logistic Regression-0.35 Threshold | Logistic Regression-0.39 Threshold |
|---|---|---|---|
| Accuracy | 0.803179 | 0.792520 | 0.800331 |
| Recall | 0.626917 | 0.760080 | 0.727712 |
| Precision | 0.727273 | 0.654523 | 0.678581 |
| F1 | 0.673376 | 0.703363 | 0.702288 |
False Negative (treating a canceled booking as the positive class): Predicting that a customer will not cancel their booking when in reality the customer cancels, leading to a loss of revenue for INN Hotels in the form of getting the room ready, not being able to resell the room, etc.
False Positive: Predicting that a customer will cancel their booking when in reality the customer does not cancel, leading to a loss of opportunity if INN Hotels decided to try to book the room to someone else at a lower price.
Recall should be maximized: the greater the recall, the fewer the false negatives.
# Defining the decision tree model
model = DecisionTreeClassifier(criterion = "gini", class_weight = "balanced", random_state = 1)
# Fitting the decision tree
model.fit(X_train, y_train)
DecisionTreeClassifier(class_weight='balanced', random_state=1)
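Setting `class_weight = "balanced"` reweights samples inversely to class frequency. scikit-learn documents the formula as `n_samples / (n_classes * class_count)`. The sketch below applies that formula to a toy label list; the helper name `balanced_class_weights` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Hypothetical helper: per-class weights computed as
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy target with a 2:1 imbalance (illustrative only)
y_toy = [0] * 6 + [1] * 3
weights = balanced_class_weights(y_toy)
print(weights)
```

With weights like these, node totals in a weighted tree are sums of sample weights, so they need not be whole numbers.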
# Function to calculate recall score
def get_recall_score(model, predictors, target):
    """
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    prediction = model.predict(predictors)
    return recall_score(target, prediction)
# Defining confusion matrix for decision tree
def confusion_matrix_sklearn(model, predictors, target):
    """
    To plot the confusion_matrix with percentages
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    y_pred = model.predict(predictors)
    cm = confusion_matrix(target, y_pred)
    labels = np.asarray(
        [
            ["{0:0.0f}".format(item) + "\n{0:.2%}".format(item / cm.flatten().sum())]
            for item in cm.flatten()
        ]
    ).reshape(2, 2)
    plt.figure(figsize=(6, 4))
    sns.heatmap(cm, annot=labels, fmt="")
    plt.ylabel("True label")
    plt.xlabel("Predicted label")
# Defining a function to compute different metrics to check performance of a classification model built using sklearn
def model_performance_classification_sklearn(model, predictors, target):
    """
    Function to compute different metrics to check classification model performance
    model: classifier
    predictors: independent variables
    target: dependent variable
    """
    # predicting using the independent variables
    pred = model.predict(predictors)
    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred)  # to compute Recall
    precision = precision_score(target, pred)  # to compute Precision
    f1 = f1_score(target, pred)  # to compute F1-score
    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1": f1,},
        index=[0],
    )
    return df_perf
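The four metrics returned by the function above reduce to simple ratios of confusion-matrix counts. The following sketch spells them out for hypothetical counts; the function `metrics_from_counts` and the numbers passed to it are made up for illustration and do not come from the hotel data.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Hypothetical helper: accuracy, recall, precision and F1 from
    true/false positive and negative counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # share of actual positives that were caught
    precision = tp / (tp + fp)     # share of predicted positives that were right
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"Accuracy": accuracy, "Recall": recall, "Precision": precision, "F1": f1}


# Made-up counts for illustration
m = metrics_from_counts(tp=60, fp=20, fn=40, tn=180)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Recall only involves true positives and false negatives, so lowering the classification threshold can raise it at the expense of precision.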
# Checking decision tree performance on training set
decision_tree_perf_train = model_performance_classification_sklearn(
model, X_train, y_train
)
decision_tree_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.993108 | 0.995097 | 0.984153 | 0.989595 |
# Plotting the decision tree confusion matrix for the training set
confusion_matrix_sklearn(model, X_train, y_train)
# Checking decision tree performance on test set
decision_tree_perf_test = model_performance_classification_sklearn(model, X_test, y_test)
decision_tree_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.863089 | 0.810619 | 0.776237 | 0.793056 |
# Plotting the decision tree confusion matrix for the test set
confusion_matrix_sklearn(model, X_test, y_test)
# Creating a list of column names
feature_names = X_train.columns.to_list()
# Plotting decision tree
plt.figure(figsize=(40, 60))
out = tree.plot_tree(
model,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=False,
class_names=None,
)
# The code below adds arrows to the decision tree splits if they are missing
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
# Text report showing the rules of a decision tree
print(tree.export_text(model, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_days_stayed <= 5.50 | | | | | |--- avg_price_per_room <= 201.50 | | | | | | |--- lead_time <= 74.50 | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 16 | | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | | |--- weights: [19.38, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | |--- avg_price_per_room <= 61.00 | | | | | | | | | | |--- avg_price_per_room <= 59.75 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 59.75 | | | | | | | | | | | |--- weights: [0.00, 50.10] class: 1 | | | | | | | | | |--- avg_price_per_room > 61.00 | | | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | |--- arrival_month > 5.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 18 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 8 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [132.71, 0.00] class: 0 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- 
avg_price_per_room <= 50.00 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 50.00 | | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | |--- lead_time > 74.50 | | | | | | | |--- lead_time <= 78.50 | | | | | | | | |--- avg_price_per_room <= 79.78 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- avg_price_per_room <= 69.85 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 69.85 | | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- weights: [12.67, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 79.78 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- weights: [0.00, 28.84] class: 1 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 78.50 | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- weights: [82.01, 0.00] class: 0 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | | |--- lead_time <= 86.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- lead_time > 86.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | 
| | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 8.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | |--- avg_price_per_room > 201.50 | | | | | | |--- arrival_month <= 10.50 | | | | | | | |--- weights: [0.00, 25.81] class: 1 | | | | | | |--- arrival_month > 10.50 | | | | | | | |--- no_of_people <= 3.00 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- no_of_people > 3.00 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | |--- no_of_days_stayed > 5.50 | | | | | |--- avg_price_per_room <= 92.80 | | | | | | |--- arrival_date <= 22.50 | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50 | | | | | | | | |--- lead_time <= 72.50 | | | | | | | | | |--- lead_time <= 33.00 | | | | | | | | | | |--- arrival_date <= 16.00 | | | | | | | | | | | |--- weights: [18.64, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 16.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- lead_time > 33.00 | | | | | | | | | | |--- arrival_month <= 6.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 6.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- lead_time > 72.50 | | | | | | | | | |--- weights: [14.91, 0.00] class: 0 | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50 | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | |--- arrival_date > 22.50 | | | | | | | |--- weights: [23.86, 0.00] class: 0 | | | | | |--- avg_price_per_room > 92.80 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- 
arrival_date <= 21.00 | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | | |--- weights: [0.00, 74.39] class: 1 | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | | | |--- arrival_month <= 6.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_month > 6.50 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | |--- arrival_date > 21.00 | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | |--- lead_time > 90.50 | | | | |--- lead_time <= 117.50 | | | | | |--- avg_price_per_room <= 93.58 | | | | | | |--- avg_price_per_room <= 75.07 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- lead_time <= 104.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- lead_time > 104.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | | | 
|--- avg_price_per_room <= 71.12 | | | | | | | | | | | |--- weights: [32.06, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 71.12 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | |--- lead_time <= 98.00 | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | | | | |--- lead_time > 98.00 | | | | | | | | | | |--- avg_price_per_room <= 63.25 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 63.25 | | | | | | | | | | | |--- weights: [0.00, 13.66] class: 1 | | | | | | |--- avg_price_per_room > 75.07 | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | |--- avg_price_per_room <= 88.50 | | | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 80.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 80.50 | | | | | | | | | | | |--- weights: [17.15, 0.00] class: 0 | | | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | | | |--- weights: [37.28, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 88.50 | | | | | | | | | |--- avg_price_per_room <= 90.19 | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | | |--- avg_price_per_room > 90.19 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- arrival_month > 3.50 | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | |--- lead_time <= 102.50 | | | | | | | | | | |--- weights: [0.00, 16.70] class: 1 | | | | | | | | | |--- lead_time > 102.50 | | | | | | | | | | |--- no_of_days_stayed <= 4.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- 
no_of_days_stayed > 4.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 86.00 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- avg_price_per_room > 86.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | |--- avg_price_per_room > 93.58 | | | | | | |--- arrival_date <= 11.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- lead_time <= 108.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- lead_time <= 102.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- lead_time > 102.00 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | |--- lead_time > 108.50 | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | |--- weights: [8.95, 1.52] class: 0 | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- lead_time <= 110.50 | | | | | | | | | |--- avg_price_per_room <= 116.75 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 116.75 | | | | | | | | | | |--- weights: [1.49, 1.52] class: 1 | | | | | | | | |--- lead_time > 110.50 | | | | | | | | | |--- no_of_days_stayed <= 2.00 | | | | | | | | | | |--- weights: [0.00, 12.14] class: 1 | | | | | | | | | |--- no_of_days_stayed > 2.00 | | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | 
|--- arrival_date > 7.50 | | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | |--- arrival_date > 11.50 | | | | | | | |--- avg_price_per_room <= 102.09 | | | | | | | | |--- arrival_date <= 14.50 | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | |--- arrival_date > 14.50 | | | | | | | | | |--- arrival_month <= 2.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 2.50 | | | | | | | | | | |--- avg_price_per_room <= 95.44 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 95.44 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- avg_price_per_room > 102.09 | | | | | | | | |--- avg_price_per_room <= 109.50 | | | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | | | |--- lead_time <= 101.00 | | | | | | | | | | | |--- weights: [0.00, 16.70] class: 1 | | | | | | | | | | |--- lead_time > 101.00 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | | | |--- weights: [33.55, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 109.50 | | | | | | | | | |--- avg_price_per_room <= 124.25 | | | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | | | |--- weights: [0.00, 71.35] class: 1 | | | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 124.25 | | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | |--- lead_time > 117.50 | | | | | |--- no_of_people <= 1.50 | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | |--- weights: [104.38, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- no_of_people > 1.50 | | | | | | 
|--- arrival_month <= 11.50
| | | | | | | |--- arrival_date <= 7.50
| | | | | | | | |--- lead_time <= 150.50
| | | | | | | | | |--- arrival_month <= 5.00
| | | | | | | | | | |--- weights: [24.60, 0.00] class: 0
| | | | | | | | | |--- arrival_month > 5.00
| | | | | | | | | | |--- arrival_month <= 6.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- arrival_month > 6.50
| | | | | | | | | | | |--- weights: [14.91, 0.00] class: 0
| | | | | | | | |--- lead_time > 150.50
| | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | |--- arrival_date > 7.50
| | | | | | | | |--- arrival_date <= 24.50
| | | | | | | | | |--- no_of_days_stayed <= 3.50
| | | | | | | | | | |--- arrival_date <= 23.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | | | |--- arrival_date > 23.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- no_of_days_stayed > 3.50
| | | | | | | | | | |--- avg_price_per_room <= 74.12
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | | |--- avg_price_per_room > 74.12
| | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0
| | | | | | | | |--- arrival_date > 24.50
| | | | | | | | | |--- avg_price_per_room <= 173.26
| | | | | | | | | | |--- avg_price_per_room <= 57.25
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- avg_price_per_room > 57.25
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- avg_price_per_room > 173.26
| | | | | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | | |--- arrival_month > 11.50
| | | | | | | |--- weights: [48.46, 0.00] class: 0
| | |--- market_segment_type_Online > 0.50
| | | |--- lead_time <= 13.50
| | | | |--- avg_price_per_room <= 99.44
| | | | | |--- arrival_month <= 1.50
| | | | | | |--- weights: [92.45, 0.00] class: 0
| | | | | |--- arrival_month > 1.50
| | | | | | |--- arrival_month <= 8.50
| | | | | | | |--- no_of_days_stayed <= 2.50
| | | | | | | | |--- lead_time <= 5.50
| | | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | | |--- arrival_month <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- arrival_month > 2.50
| | | | | | | | | | | |--- weights: [27.59, 0.00] class: 0
| | | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | | |--- avg_price_per_room <= 74.40
| | | | | | | | | | | |--- weights: [18.64, 0.00] class: 0
| | | | | | | | | | |--- avg_price_per_room > 74.40
| | | | | | | | | | | |--- truncated branch of depth 14
| | | | | | | | |--- lead_time > 5.50
| | | | | | | | | |--- arrival_date <= 3.50
| | | | | | | | | | |--- weights: [7.46, 0.00] class: 0
| | | | | | | | | |--- arrival_date > 3.50
| | | | | | | | | | |--- avg_price_per_room <= 68.38
| | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0
| | | | | | | | | | |--- avg_price_per_room > 68.38
| | | | | | | | | | | |--- truncated branch of depth 10
| | | | | | | |--- no_of_days_stayed > 2.50
| | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | |--- avg_price_per_room <= 85.50
| | | | | | | | | | |--- weights: [0.00, 22.77] class: 1
| | | | | | | | | |--- avg_price_per_room > 85.50
| | | | | | | | | | |--- required_car_parking_space <= 0.50
| | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0
| | | | | | | | | | |--- required_car_parking_space > 0.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | |--- lead_time <= 2.50
| | | | | | | | | | |--- arrival_date <= 20.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 20.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- lead_time > 2.50
| | | | | | | | | | |--- arrival_date <= 13.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- arrival_date > 13.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | |--- arrival_month > 8.50
| | | | | | | |--- no_of_days_stayed <= 5.50
| | | | | | | | |--- avg_price_per_room <= 94.66
| | | | | | | | | |--- arrival_date <= 1.50
| | | | | | | | | | |--- lead_time <= 11.00
| | | | | | | | | | | |--- weights: [5.96, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 11.00
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- arrival_date > 1.50
| | | | | | | | | | |--- avg_price_per_room <= 90.17
| | | | | | | | | | | |--- weights: [116.31, 0.00] class: 0
| | | | | | | | | | |--- avg_price_per_room > 90.17
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | |--- avg_price_per_room > 94.66
| | | | | | | | | |--- avg_price_per_room <= 95.10
| | | | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- avg_price_per_room > 95.10
| | | | | | | | | | |--- weights: [14.91, 0.00] class: 0
| | | | | | | |--- no_of_days_stayed > 5.50
| | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | |--- lead_time <= 3.50
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- lead_time > 3.50
| | | | | | | | | | |--- weights: [0.00, 9.11] class: 1
| | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | |--- avg_price_per_room <= 63.32
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- avg_price_per_room > 63.32
| | | | | | | | | | |--- weights: [5.22, 0.00] class: 0
| | | | |--- avg_price_per_room > 99.44
| | | | | |--- lead_time <= 3.50
| | | | | | |--- avg_price_per_room <= 202.67
| | | | | | | |--- no_of_days_stayed <= 6.50
| | | | | | | | |--- arrival_month <= 5.50
| | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | | | |--- avg_price_per_room <= 163.00
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- avg_price_per_room > 163.00
| | | | | | | | | | | |--- weights: [8.95, 0.00] class: 0
| | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | | | |--- weights: [10.44, 0.00] class: 0
| | | | | | | | |--- arrival_month > 5.50
| | | | | | | | | |--- arrival_date <= 20.50
| | | | | | | | | | |--- avg_price_per_room <= 132.39
| | | | | | | | | | | |--- weights: [60.39, 0.00] class: 0
| | | | | | | | | | |--- avg_price_per_room > 132.39
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | |--- arrival_date > 20.50
| | | | | | | | | | |--- arrival_date <= 24.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- arrival_date > 24.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | |--- no_of_days_stayed > 6.50
| | | | | | | | |--- weights: [0.00, 6.07] class: 1
| | | | | | |--- avg_price_per_room > 202.67
| | | | | | | |--- arrival_month <= 11.00
| | | | | | | | |--- weights: [0.00, 22.77] class: 1
| | | | | | | |--- arrival_month > 11.00
| | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | |--- lead_time > 3.50
| | | | | | |--- arrival_month <= 8.50
| | | | | | | |--- avg_price_per_room <= 119.25
| | | | | | | | |--- avg_price_per_room <= 118.50
| | | | | | | | | |--- lead_time <= 12.50
| | | | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | | | |--- truncated branch of depth 11
| | | | | | | | | |--- lead_time > 12.50
| | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | |--- avg_price_per_room > 118.50
| | | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | | |--- weights: [7.46, 0.00] class: 0
| | | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | | |--- lead_time <= 4.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- lead_time > 4.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | |--- avg_price_per_room > 119.25
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- lead_time <= 6.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- lead_time > 6.50
| | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- arrival_month <= 1.50
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- arrival_month > 1.50
| | | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | |--- arrival_month > 8.50
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- no_of_days_stayed <= 1.50
| | | | | | | | | |--- lead_time <= 9.00
| | | | | | | | | | |--- arrival_month <= 10.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 10.50
| | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0
| | | | | | | | | |--- lead_time > 9.00
| | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | |--- no_of_days_stayed > 1.50
| | | | | | | | | |--- weights: [21.62, 0.00] class: 0
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | |--- arrival_date <= 14.00
| | | | | | | | | | |--- arrival_date <= 1.50
| | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 1.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | | |--- arrival_date > 14.00
| | | | | | | | | | |--- avg_price_per_room <= 208.67
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- avg_price_per_room > 208.67
| | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | |--- weights: [15.66, 0.00] class: 0
| | | |--- lead_time > 13.50
| | | | |--- required_car_parking_space <= 0.50
| | | | | |--- avg_price_per_room <= 71.92
| | | | | | |--- avg_price_per_room <= 59.43
| | | | | | | |--- lead_time <= 84.50
| | | | | | | | |--- arrival_date <= 17.50
| | | | | | | | | |--- lead_time <= 51.50
| | | | | | | | | | |--- avg_price_per_room <= 21.67
| | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0
| | | | | | | | | | |--- avg_price_per_room > 21.67
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- lead_time > 51.50
| | | | | | | | | | |--- weights: [12.67, 0.00] class: 0
| | | | | | | | |--- arrival_date > 17.50
| | | | | | | | | |--- weights: [23.11, 0.00] class: 0
| | | | | | | |--- lead_time > 84.50
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- arrival_date <= 27.00
| | | | | | | | | | |--- lead_time <= 131.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- lead_time > 131.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_date > 27.00
| | | | | | | | | | |--- lead_time <= 92.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 92.50
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- weights: [10.44, 0.00] class: 0
| | | | | | |--- avg_price_per_room > 59.43
| | | | | | | |--- lead_time <= 25.50
| | | | | | | | |--- no_of_days_stayed <= 4.50
| | | | | | | | | |--- no_of_days_stayed <= 1.50
| | | | | | | | | | |--- arrival_date <= 12.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_date > 12.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | |--- no_of_days_stayed > 1.50
| | | | | | | | | | |--- weights: [14.91, 0.00] class: 0
| | | | | | | | |--- no_of_days_stayed > 4.50
| | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | |--- arrival_date <= 4.00
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 4.00
| | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | |--- no_of_previous_cancellations <= 0.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | | |--- no_of_previous_cancellations > 0.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | |--- lead_time > 25.50
| | | | | | | | |--- avg_price_per_room <= 71.34
| | | | | | | | | |--- arrival_month <= 3.50
| | | | | | | | | | |--- lead_time <= 68.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | | | |--- lead_time > 68.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | |--- arrival_month > 3.50
| | | | | | | | | | |--- lead_time <= 102.00
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | | |--- lead_time > 102.00
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- avg_price_per_room > 71.34
| | | | | | | | | |--- weights: [11.18, 0.00] class: 0
| | | | | |--- avg_price_per_room > 71.92
| | | | | | |--- arrival_year <= 2017.50
| | | | | | | |--- lead_time <= 65.50
| | | | | | | | |--- avg_price_per_room <= 120.45
| | | | | | | | | |--- arrival_month <= 7.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- arrival_month > 7.50
| | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | |--- avg_price_per_room > 120.45
| | | | | | | | | |--- room_type_reserved_Room_Type 6 <= 0.50
| | | | | | | | | | |--- no_of_days_stayed <= 2.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | | |--- no_of_days_stayed > 2.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- room_type_reserved_Room_Type 6 > 0.50
| | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | |--- lead_time > 65.50
| | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50
| | | | | | | | | |--- arrival_date <= 27.50
| | | | | | | | | | |--- avg_price_per_room <= 75.75
| | | | | | | | | | | |--- weights: [0.00, 10.63] class: 1
| | | | | | | | | | |--- avg_price_per_room > 75.75
| | | | | | | | | | | |--- truncated branch of depth 11
| | | | | | | | | |--- arrival_date > 27.50
| | | | | | | | | | |--- weights: [3.73, 0.00] class: 0
| | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50
| | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | | | |--- weights: [0.00, 60.72] class: 1
| | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | |--- arrival_year > 2017.50
| | | | | | | |--- avg_price_per_room <= 104.31
| | | | | | | | |--- lead_time <= 25.50
| | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | |--- arrival_month <= 1.50
| | | | | | | | | | | |--- weights: [16.40, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 1.50
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | |--- weights: [23.11, 0.00] class: 0
| | | | | | | | |--- lead_time > 25.50
| | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | | | |--- arrival_month <= 5.50
| | | | | | | | | | | |--- truncated branch of depth 20
| | | | | | | | | | |--- arrival_month > 5.50
| | | | | | | | | | | |--- truncated branch of depth 13
| | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | | | |--- truncated branch of depth 19
| | | | | | | |--- avg_price_per_room > 104.31
| | | | | | | | |--- arrival_month <= 10.50
| | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | | | |--- avg_price_per_room <= 195.30
| | | | | | | | | | | |--- truncated branch of depth 27
| | | | | | | | | | |--- avg_price_per_room > 195.30
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | | | |--- arrival_date <= 22.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | | |--- arrival_date > 22.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | |--- arrival_month > 10.50
| | | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | | |--- avg_price_per_room <= 162.82
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | | |--- avg_price_per_room > 162.82
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | | |--- lead_time <= 22.00
| | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 22.00
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | |--- required_car_parking_space > 0.50
| | | | | |--- no_of_days_stayed <= 11.00
| | | | | | |--- weights: [48.46, 0.00] class: 0
| | | | | |--- no_of_days_stayed > 11.00
| | | | | | |--- weights: [0.00, 1.52] class: 1
| |--- no_of_special_requests > 0.50
| | |--- no_of_special_requests <= 1.50
| | | |--- market_segment_type_Online <= 0.50
| | | | |--- lead_time <= 102.50
| | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | |--- no_of_days_stayed <= 15.00
| | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50
| | | | | | | | |--- lead_time <= 91.50
| | | | | | | | | |--- avg_price_per_room <= 129.50
| | | | | | | | | | |--- weights: [632.23, 0.00] class: 0
| | | | | | | | | |--- avg_price_per_room > 129.50
| | | | | | | | | | |--- avg_price_per_room <= 131.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 131.50
| | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0
| | | | | | | | |--- lead_time > 91.50
| | | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | | |--- weights: [30.57, 0.00] class: 0
| | | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | | |--- arrival_date <= 9.00
| | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | | | | |--- arrival_date > 9.00
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50
| | | | | | | | |--- no_of_days_stayed <= 4.50
| | | | | | | | | |--- arrival_month <= 4.00
| | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | |--- arrival_month > 4.00
| | | | | | | | | | |--- weights: [7.46, 0.00] class: 0
| | | | | | | | |--- no_of_days_stayed > 4.50
| | | | | | | | | |--- market_segment_type_Corporate <= 0.50
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- market_segment_type_Corporate > 0.50
| | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | |--- no_of_days_stayed > 15.00
| | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | |--- lead_time <= 63.00
| | | | | | | |--- market_segment_type_Corporate <= 0.50
| | | | | | | | |--- weights: [13.42, 0.00] class: 0
| | | | | | | |--- market_segment_type_Corporate > 0.50
| | | | | | | | |--- arrival_date <= 14.50
| | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | |--- arrival_date > 14.50
| | | | | | | | | |--- weights: [1.49, 1.52] class: 1
| | | | | | |--- lead_time > 63.00
| | | | | | | |--- weights: [0.00, 7.59] class: 1
| | | | |--- lead_time > 102.50
| | | | | |--- lead_time <= 104.50
| | | | | | |--- lead_time <= 103.50
| | | | | | | |--- avg_price_per_room <= 95.17
| | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | |--- avg_price_per_room > 95.17
| | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50
| | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50
| | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | |--- lead_time > 103.50
| | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | |--- lead_time > 104.50
| | | | | | |--- lead_time <= 150.50
| | | | | | | |--- avg_price_per_room <= 141.75
| | | | | | | | |--- no_of_days_stayed <= 3.50
| | | | | | | | | |--- avg_price_per_room <= 81.00
| | | | | | | | | | |--- arrival_month <= 3.50
| | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 3.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | |--- avg_price_per_room > 81.00
| | | | | | | | | | |--- avg_price_per_room <= 102.70
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 102.70
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | |--- no_of_days_stayed > 3.50
| | | | | | | | | |--- weights: [20.13, 0.00] class: 0
| | | | | | | |--- avg_price_per_room > 141.75
| | | | | | | | |--- arrival_date <= 13.00
| | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | | |--- arrival_date > 13.00
| | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | |--- lead_time > 150.50
| | | | | | | |--- no_of_days_stayed <= 2.50
| | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | |--- no_of_days_stayed > 2.50
| | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | |--- market_segment_type_Online > 0.50
| | | | |--- lead_time <= 8.50
| | | | | |--- lead_time <= 4.50
| | | | | | |--- no_of_days_stayed <= 14.00
| | | | | | | |--- avg_price_per_room <= 219.86
| | | | | | | | |--- no_of_days_stayed <= 6.50
| | | | | | | | | |--- avg_price_per_room <= 157.64
| | | | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | |--- avg_price_per_room > 157.64
| | | | | | | | | | |--- avg_price_per_room <= 158.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- avg_price_per_room > 158.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | |--- no_of_days_stayed > 6.50
| | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | | | |--- arrival_date <= 5.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 5.50
| | | | | | | | | | | |--- weights: [5.96, 0.00] class: 0
| | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | | | |--- arrival_date <= 6.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 6.50
| | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | |--- avg_price_per_room > 219.86
| | | | | | | | |--- arrival_date <= 11.50
| | | | | | | | | |--- weights: [3.73, 0.00] class: 0
| | | | | | | | |--- arrival_date > 11.50
| | | | | | | | | |--- arrival_date <= 14.50
| | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | | | |--- arrival_date > 14.50
| | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | |--- no_of_days_stayed > 14.00
| | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | |--- lead_time > 4.50
| | | | | | |--- arrival_date <= 13.50
| | | | | | | |--- arrival_month <= 9.50
| | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50
| | | | | | | | | |--- avg_price_per_room <= 88.39
| | | | | | | | | | |--- arrival_date <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- arrival_date > 3.50
| | | | | | | | | | | |--- weights: [11.93, 0.00] class: 0
| | | | | | | | | |--- avg_price_per_room > 88.39
| | | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50
| | | | | | | | | |--- avg_price_per_room <= 94.48
| | | | | | | | | | |--- lead_time <= 7.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- lead_time > 7.50
| | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | | | |--- avg_price_per_room > 94.48
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | |--- arrival_month > 9.50
| | | | | | | | |--- avg_price_per_room <= 157.12
| | | | | | | | | |--- weights: [32.06, 0.00] class: 0
| | | | | | | | |--- avg_price_per_room > 157.12
| | | | | | | | | |--- lead_time <= 6.00
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- lead_time > 6.00
| | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | |--- arrival_date > 13.50
| | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | |--- avg_price_per_room <= 139.57
| | | | | | | | | |--- avg_price_per_room <= 101.59
| | | | | | | | | | |--- avg_price_per_room <= 101.22
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- avg_price_per_room > 101.22
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- avg_price_per_room > 101.59
| | | | | | | | | | |--- weights: [57.41, 0.00] class: 0
| | | | | | | | |--- avg_price_per_room > 139.57
| | | | | | | | | |--- arrival_date <= 15.50
| | | | | | | | | | |--- arrival_date <= 14.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 14.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_date > 15.50
| | | | | | | | | | |--- avg_price_per_room <= 140.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- avg_price_per_room > 140.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | |--- avg_price_per_room <= 126.33
| | | | | | | | | |--- arrival_month <= 6.50
| | | | | | | | | | |--- arrival_month <= 3.50
| | | | | | | | | | | |--- weights: [8.20, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 3.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- arrival_month > 6.50
| | | | | | | | | | |--- weights: [17.89, 0.00] class: 0
| | | | | | | | |--- avg_price_per_room > 126.33
| | | | | | | | | |--- no_of_days_stayed <= 1.50
| | | | | | | | | | |--- arrival_date <= 16.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 16.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- no_of_days_stayed > 1.50
| | | | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | |--- lead_time > 8.50
| | | | | |--- required_car_parking_space <= 0.50
| | | | | | |--- avg_price_per_room <= 118.55
| | | | | | | |--- lead_time <= 61.50
| | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | |--- no_of_days_stayed <= 6.50
| | | | | | | | | | |--- arrival_month <= 1.50
| | | | | | | | | | | |--- weights: [65.61, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 1.50
| | | | | | | | | | | |--- truncated branch of depth 20
| | | | | | | | | |--- no_of_days_stayed > 6.50
| | | | | | | | | | |--- arrival_month <= 1.50
| | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0
| | | | | | | | | | |--- arrival_month > 1.50
| | | | | | | | | | | |--- truncated branch of depth 11
| | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | |--- no_of_days_stayed <= 12.50
| | | | | | | | | | |--- weights: [126.74, 0.00] class: 0
| | | | | | | | | |--- no_of_days_stayed > 12.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | |--- lead_time > 61.50
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- arrival_month <= 7.50
| | | | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_month > 7.50
| | | | | | | | | | |--- lead_time <= 66.50
| | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 66.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | | |--- avg_price_per_room <= 71.93
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- avg_price_per_room > 71.93
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | | |--- no_of_days_stayed <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | | |--- no_of_days_stayed > 2.50
| | | | | | | | | | | |--- truncated branch of depth 18
| | | | | | |--- avg_price_per_room > 118.55
| | | | | | | |--- arrival_month <= 8.50
| | | | | | | | |--- arrival_date <= 19.50
| | | | | | | | | |--- no_of_people <= 2.50
| | | | | | | | | | |--- lead_time <= 146.50
| | | | | | | | | | | |--- truncated branch of depth 15
| | | | | | | | | | |--- lead_time > 146.50
| | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | | | | | |--- no_of_people > 2.50
| | | | | | | | | | |--- no_of_days_stayed <= 2.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- no_of_days_stayed > 2.50
| | | | | | | | | | | |--- truncated branch of depth 14
| | | | | | | | |--- arrival_date > 19.50
| | | | | | | | | |--- arrival_date <= 27.50
| | | | | | | | | | |--- avg_price_per_room <= 121.20
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- avg_price_per_room > 121.20
| | | | | | | | | | | |--- truncated branch of depth 12
| | | | | | | | | |--- arrival_date > 27.50
| | | | | | | | | | |--- lead_time <= 55.50
| | | | | | | | | | | |--- truncated branch of depth 10
| | | | | | | | | | |--- lead_time > 55.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | |--- arrival_month > 8.50
| | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | |--- arrival_month <= 9.50
| | | | | | | | | | |--- lead_time <= 14.50
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | | | | | | |--- lead_time > 14.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- arrival_month > 9.50
| | | | | | | | | | |--- weights: [37.28, 0.00] class: 0
| | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | |--- avg_price_per_room <= 119.20
| | | | | | | | | | | |--- truncated branch of depth 7
| | | | | | | | | | |--- avg_price_per_room > 119.20
| | | | | | | | | | | |--- truncated branch of depth 23
| | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | |--- lead_time <= 100.00
| | | | | | | | | | | |--- weights: [49.95, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 100.00
| | | | | | | | | | | |--- truncated branch of depth 3
| | | | | |--- required_car_parking_space > 0.50
| | | | | | |--- room_type_reserved_Room_Type 7 <= 0.50
| | | | | | | |--- weights: [134.20, 0.00] class: 0
| | | | | | |--- room_type_reserved_Room_Type 7 > 0.50
| | | | | | | |--- weights: [0.00, 1.52] class: 1
| | |--- no_of_special_requests > 1.50
| | | |--- lead_time <= 90.50
| | | | |--- no_of_days_stayed <= 4.50
| | | | | |--- no_of_days_stayed <= 3.50
| | | | | | |--- weights: [1259.24, 0.00] class: 0
| | | | | |--- no_of_days_stayed > 3.50
| | | | | | |--- room_type_reserved_Room_Type 6 <= 0.50
| | | | | | | |--- avg_price_per_room <= 90.05
| | | | | | | | |--- lead_time <= 48.00
| | | | | | | | | |--- arrival_month <= 2.50
| | | | | | | | | | |--- lead_time <= 20.00
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- lead_time > 20.00
| | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | | | | |--- arrival_month > 2.50
| | | | | | | | | | |--- weights: [45.48, 0.00] class: 0
| | | | | | | | |--- lead_time > 48.00
| | | | | | | | | |--- avg_price_per_room <= 89.85
| | | | | | | | | | |--- arrival_date <= 14.50
| | | | | | | | | | | |--- weights: [13.42, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 14.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- avg_price_per_room > 89.85
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | |--- avg_price_per_room > 90.05
| | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50
| | | | | | | | | |--- arrival_date <= 29.50
| | | | | | | | | | |--- weights: [211.74, 0.00] class: 0
| | | | | | | | | |--- arrival_date > 29.50
| | | | | | | | | | |--- lead_time <= 54.50
| | | | | | | | | | | |--- weights: [12.67, 0.00] class: 0
| | | | | | | | | | |--- lead_time > 54.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50
| | | | | | | | | |--- lead_time <= 28.50
| | | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | | |--- weights: [10.44, 0.00] class: 0
| | | | | | | | | |--- lead_time > 28.50
| | | | | | | | | | |--- lead_time <= 30.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- lead_time > 30.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | |--- room_type_reserved_Room_Type 6 > 0.50
| | | | | | | |--- lead_time <= 31.00
| | | | | | | | |--- weights: [9.69, 0.00] class: 0
| | | | | | | |--- lead_time > 31.00
| | | | | | | | |--- arrival_date <= 4.50
| | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | |--- arrival_date > 4.50
| | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | |--- no_of_days_stayed > 4.50
| | | | | |--- no_of_days_stayed <= 12.00
| | | | | | |--- no_of_special_requests <= 2.50
| | | | | | | |--- no_of_days_stayed <= 6.50
| | | | | | | | |--- avg_price_per_room <= 144.28
| | | | | | | | | |--- avg_price_per_room <= 134.74
| | | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | | |--- truncated branch of depth 9
| | | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | | | | |--- avg_price_per_room > 134.74
| | | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0
| | | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- avg_price_per_room > 144.28
| | | | | | | | | |--- weights: [35.79, 0.00] class: 0
| | | | | | | |--- no_of_days_stayed > 6.50
| | | | | | | | |--- lead_time <= 9.00
| | | | | | | | | |--- weights: [9.69, 0.00] class: 0
| | | | | | | | |--- lead_time > 9.00
| | | | | | | | | |--- lead_time <= 34.50
| | | | | | | | | | |--- arrival_date <= 9.50
| | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0
| | | | | | | | | | |--- arrival_date > 9.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | |--- lead_time > 34.50
| | | | | | | | | | |--- lead_time <= 72.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- lead_time > 72.50
| | | | | | | | | | | |--- truncated branch of depth 5
| | | | | | |--- no_of_special_requests > 2.50
| | | | | | | |--- weights: [51.44, 0.00] class: 0
| | | | | |--- no_of_days_stayed > 12.00
| | | | | | |--- weights: [0.00, 3.04] class: 1
| | | |--- lead_time > 90.50
| | | | |--- no_of_special_requests <= 2.50
| | | | | |--- arrival_month <= 8.50
| | | | | | |--- avg_price_per_room <= 202.95
| | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | |--- arrival_month <= 7.50
| | | | | | | | | |--- arrival_date <= 4.50
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- arrival_date > 4.50
| | | | | | | | | | |--- arrival_date <= 26.00
| | | | | | | | | | | |--- weights: [0.00, 7.59] class: 1
| | | | | | | | | | |--- arrival_date > 26.00
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | |--- arrival_month > 7.50
| | | | | | | | | |--- arrival_date <= 24.50
| | | | | | | | | | |--- lead_time <= 98.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | | |--- lead_time > 98.50
| | | | | | | | | | | |--- truncated branch of depth 2
| | | | | | | | | |--- arrival_date > 24.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | |--- arrival_year > 2017.50
| | | | | | | | |--- lead_time <= 150.50
| | | | | | | | | |--- no_of_days_stayed <= 5.50
| | | | | | | | | | |--- no_of_days_stayed <= 3.50
| | | | | | | | | | | |--- truncated branch of depth 8
| | | | | | | | | | |--- no_of_days_stayed > 3.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | |--- no_of_days_stayed > 5.50
| | | | | | | | | | |--- arrival_date <= 5.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | | | |--- arrival_date > 5.50
| | | | | | | | | | | |--- truncated branch of depth 4
| | | | | | | | |--- lead_time > 150.50
| | | | | | | | | |--- arrival_date <= 19.50
| | | | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | | | | | |--- arrival_date > 19.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | |--- avg_price_per_room > 202.95
| | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50
| | | | | | | | |--- weights: [0.00, 7.59] class: 1
| | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50
| | | | | | | | |--- weights: [0.00, 3.04] class: 1
| | | | | |--- arrival_month > 8.50
| | | | | | |--- avg_price_per_room <= 153.15
| | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50
| | | | | | | | |--- avg_price_per_room <= 71.12
| | | | | | | | | |--- weights: [3.73, 0.00] class: 0
| | | | | | | | |--- avg_price_per_room > 71.12
| | | | | | | | | |--- avg_price_per_room <= 90.42
| | | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | | |--- truncated branch of depth 6
| | | | | | | | | |--- avg_price_per_room > 90.42
| | | | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | | | |--- weights: [5.96, 0.00] class: 0
| | | | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | | | |--- truncated branch of depth 16
| | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50
| | | | | | | | |--- weights: [5.96, 0.00] class: 0
| | | | | | |--- avg_price_per_room > 153.15
| | | | | | | |--- arrival_date <= 22.50
| | | | | | | | |--- no_of_people <= 1.50
| | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | |--- no_of_people > 1.50
| | | | | | | | | |--- weights: [7.46, 0.00] class: 0
| | | | | | | |--- arrival_date > 22.50
| | | | | | | | |--- lead_time <= 106.50
| | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | |--- lead_time > 106.50
| | | | | | | | | |--- arrival_date <= 23.50
| | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | |--- arrival_date > 23.50
| | | | | | | | | | |--- weights: [4.47, 0.00] class: 0
| | | | |--- no_of_special_requests > 2.50
| | | | | |--- weights: [67.10, 0.00] class: 0
|--- lead_time > 151.50
| |--- avg_price_per_room <= 100.04
| | |--- no_of_special_requests <= 0.50
| | | |--- no_of_people <= 1.50
| | | | |--- market_segment_type_Online <= 0.50
| | | | | |--- lead_time <= 163.50
| | | | | | |--- arrival_month <= 5.00
| | | | | | | |--- weights: [2.98, 0.00] class: 0
| | | | | | |--- arrival_month > 5.00
| | | | | | | |--- lead_time <= 162.50
| | | | | | | | |--- weights: [0.75, 1.52] class: 1
| | | | | | | |--- lead_time > 162.50
| | | | | | | | |--- weights: [0.00, 22.77] class: 1
| | | | | |--- lead_time > 163.50
| | | | | | |--- lead_time <= 341.00
| | | | | | | |--- lead_time <= 173.00
| | | | | | | | |--- arrival_date <= 3.50
| | | | | | | | | |--- arrival_year <= 2017.50
| | | | | | | | | | |--- no_of_days_stayed <= 1.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | | |--- no_of_days_stayed > 1.50
| | | | | | | | | | | |--- weights: [45.48, 9.11] class: 0
| | | | | | | | | |--- arrival_year > 2017.50
| | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | |--- arrival_date > 3.50
| | | | | | | | | |--- no_of_days_stayed <= 3.00
| | | | | | | | | | |--- weights: [0.00, 13.66] class: 1
| | | | | | | | | |--- no_of_days_stayed > 3.00
| | | | | | | | | | |--- weights: [2.24, 0.00] class: 0
| | | | | | | |--- lead_time > 173.00
| | | | | | | | |--- arrival_month <= 5.50
| | | | | | | | | |--- arrival_date <= 7.50
| | | | | | | | | | |--- weights: [0.00, 4.55] class: 1
| | | | | | | | | |--- arrival_date > 7.50
| | | | | | | | | | |--- weights: [6.71, 0.00] class: 0
| | | | | | | | |--- arrival_month > 5.50
| | | | | | | | | |--- avg_price_per_room <= 55.21
| | | | | | | | | | |--- arrival_month <= 11.50
| | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1
| | | | | | | | | | |--- arrival_month > 11.50
| | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0
| | | | | | | | | |--- 
avg_price_per_room > 55.21 | | | | | | | | | | |--- avg_price_per_room <= 98.00 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- avg_price_per_room > 98.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- lead_time > 341.00 | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | |--- avg_price_per_room <= 88.33 | | | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | | | | |--- avg_price_per_room > 88.33 | | | | | | | | | |--- weights: [0.75, 1.52] class: 1 | | | | | | | |--- arrival_date > 8.50 | | | | | | | | |--- lead_time <= 402.00 | | | | | | | | | |--- avg_price_per_room <= 80.00 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 80.00 | | | | | | | | | | |--- arrival_date <= 18.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 18.50 | | | | | | | | | | | |--- weights: [2.24, 3.04] class: 1 | | | | | | | | |--- lead_time > 402.00 | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- avg_price_per_room <= 2.50 | | | | | | |--- lead_time <= 285.50 | | | | | | | |--- arrival_date <= 5.00 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- arrival_date > 5.00 | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | |--- lead_time > 285.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- avg_price_per_room > 2.50 | | | | | | |--- arrival_date <= 29.50 | | | | | | | |--- weights: [0.00, 74.39] class: 1 | | | | | | |--- arrival_date > 29.50 | | | | | | | |--- arrival_month <= 11.00 | | | | | | | | |--- weights: [0.00, 6.07] class: 1 | | | | | | | |--- arrival_month > 11.00 | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | |--- no_of_people > 1.50 | | | | |--- 
avg_price_per_room <= 82.47 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- weights: [0.00, 201.91] class: 1 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | |--- arrival_date <= 14.50 | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | |--- arrival_date > 14.50 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | |--- avg_price_per_room <= 80.51 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | | | | |--- weights: [0.00, 19.74] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- weights: [0.00, 57.69] class: 1 | | | | | | | | |--- avg_price_per_room > 80.51 | | | | | | | | | |--- lead_time <= 236.00 | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | | | | | |--- lead_time > 236.00 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- lead_time <= 244.00 | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | |--- lead_time <= 166.50 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | |--- lead_time > 166.50 | | | | | | | | | | |--- arrival_date <= 19.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 19.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50 | | | | | | | | | | | |--- 
weights: [0.00, 1.52] class: 1 | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | |--- arrival_date <= 15.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_date > 15.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | |--- lead_time > 244.00 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [25.35, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- avg_price_per_room <= 76.00 | | | | | | | | | | | |--- truncated branch of depth 9 | | | | | | | | | | |--- avg_price_per_room > 76.00 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [46.97, 0.00] class: 0 | | | | |--- avg_price_per_room > 82.47 | | | | | |--- lead_time <= 324.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | | |--- weights: [0.00, 505.53] class: 1 | | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | | | 
| |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- arrival_date <= 5.50 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 5.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | |--- weights: [0.00, 19.74] class: 1 | | | | | |--- lead_time > 324.50 | | | | | | |--- avg_price_per_room <= 89.00 | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 89.00 | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | |--- weights: [0.00, 6.07] class: 1 | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | |--- weights: [0.75, 7.59] class: 1 | | |--- no_of_special_requests > 0.50 | | | |--- market_segment_type_Offline <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- lead_time <= 159.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- lead_time <= 152.50 | | | | | | | | |--- arrival_date <= 23.00 | | | | | | | | | |--- arrival_date <= 4.50 | | | | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | | | | | |--- arrival_date > 4.50 | | | | | | | | | | |--- arrival_date <= 13.00 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 13.00 | | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | |--- arrival_date > 23.00 | | | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | |--- lead_time > 152.50 | | | | | | | | |--- lead_time <= 156.50 | | | | | | | | | |--- weights: [8.95, 0.00] class: 0 | | | | | | | | |--- lead_time > 156.50 | | | | | | | | | |--- arrival_date <= 29.50 
| | | | | | | | | | |--- arrival_date <= 10.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- arrival_date > 10.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | |--- avg_price_per_room <= 87.12 | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 87.12 | | | | | | | | | |--- arrival_month <= 11.00 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 11.00 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | |--- arrival_date > 23.50 | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | |--- lead_time > 159.50 | | | | | | |--- avg_price_per_room <= 93.44 | | | | | | | |--- arrival_date <= 28.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_date <= 25.50 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 25.50 | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | | | |--- weights: [48.46, 0.00] class: 0 | | | | | | | |--- arrival_date > 28.50 | | | | | | | | |--- 
avg_price_per_room <= 85.17 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 85.17 | | | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | | |--- avg_price_per_room > 93.44 | | | | | | | |--- lead_time <= 178.50 | | | | | | | | |--- avg_price_per_room <= 95.75 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 95.75 | | | | | | | | | |--- arrival_date <= 9.50 | | | | | | | | | | |--- arrival_date <= 6.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 6.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- arrival_date > 9.50 | | | | | | | | | | |--- weights: [18.64, 0.00] class: 0 | | | | | | | |--- lead_time > 178.50 | | | | | | | | |--- lead_time <= 179.50 | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | |--- lead_time > 179.50 | | | | | | | | | |--- arrival_date <= 12.00 | | | | | | | | | | |--- weights: [2.98, 1.52] class: 0 | | | | | | | | | |--- arrival_date > 12.00 | | | | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | |--- lead_time > 180.50 | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- avg_price_per_room <= 66.47 | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 66.47 | | | | | | | | |--- lead_time <= 187.50 | | | | | | | | | |--- arrival_month <= 4.00 | | | | | | | 
| | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 4.00 | | | | | | | | | | |--- avg_price_per_room <= 78.30 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 78.30 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- lead_time > 187.50 | | | | | | | | | |--- lead_time <= 304.50 | | | | | | | | | | |--- avg_price_per_room <= 99.30 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | | |--- avg_price_per_room > 99.30 | | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- lead_time > 304.50 | | | | | | | | | | |--- arrival_month <= 9.00 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 9.00 | | | | | | | | | | | |--- weights: [0.00, 25.81] class: 1 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | |--- no_of_days_stayed > 3.50 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- no_of_days_stayed <= 11.50 | | | | | | | | |--- avg_price_per_room <= 69.40 | | | | | | | | | |--- avg_price_per_room <= 64.43 | | | | | | | | | | |--- avg_price_per_room <= 55.92 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 55.92 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- avg_price_per_room > 64.43 | | | | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 69.40 | | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | | |--- truncated branch of depth 12 | | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | | |--- truncated branch of depth 13 | | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | | |--- weights: [5.22, 
0.00] class: 0 | | | | | | | |--- no_of_days_stayed > 11.50 | | | | | | | | |--- lead_time <= 198.00 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | |--- lead_time > 198.00 | | | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | |--- market_segment_type_Offline > 0.50 | | | | |--- lead_time <= 348.50 | | | | | |--- no_of_people <= 2.50 | | | | | | |--- no_of_days_stayed <= 7.50 | | | | | | | |--- arrival_date <= 30.50 | | | | | | | | |--- lead_time <= 331.00 | | | | | | | | | |--- weights: [106.61, 0.00] class: 0 | | | | | | | | |--- lead_time > 331.00 | | | | | | | | | |--- lead_time <= 336.50 | | | | | | | | | | |--- weights: [1.49, 1.52] class: 1 | | | | | | | | | |--- lead_time > 336.50 | | | | | | | | | | |--- avg_price_per_room <= 68.00 | | | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 68.00 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | | |--- arrival_date > 30.50 | | | | | | | | |--- no_of_days_stayed <= 5.00 | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | |--- no_of_days_stayed > 5.00 | | | | | | | | | |--- weights: [1.49, 1.52] class: 1 | | | | | | |--- no_of_days_stayed > 7.50 | | | | | | | |--- no_of_special_requests <= 1.50 | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | |--- no_of_special_requests > 1.50 | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | |--- no_of_people > 2.50 | | | | | | |--- lead_time <= 196.00 | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | | | |--- lead_time > 196.00 | | | | | | | |--- avg_price_per_room <= 93.12 | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 93.12 | | | | | | | | |--- no_of_days_stayed <= 5.00 | | | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | | | | |--- no_of_days_stayed > 5.00 | | | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | |--- lead_time > 348.50 | | | | | |--- 
avg_price_per_room <= 84.00 | | | | | | |--- avg_price_per_room <= 58.50 | | | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 58.50 | | | | | | | |--- weights: [4.47, 3.04] class: 0 | | | | | |--- avg_price_per_room > 84.00 | | | | | | |--- lead_time <= 381.50 | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | |--- lead_time > 381.50 | | | | | | | |--- weights: [0.75, 1.52] class: 1 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- room_type_reserved_Room_Type 6 <= 0.50 | | | | | |--- weights: [0.00, 3103.03] class: 1 | | | | |--- room_type_reserved_Room_Type 6 > 0.50 | | | | | |--- weights: [0.00, 97.16] class: 1 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [23.11, 0.00] class: 0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- arrival_year <= 2017.50 | | | | | |--- weights: [0.75, 0.00] class: 0 | | | | |--- arrival_year > 2017.50 | | | | | |--- weights: [34.30, 0.00] class: 0 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | |--- arrival_date > 24.50 | | | | | |--- no_of_people <= 2.50 | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | |--- no_of_people > 2.50 | | | | | | |--- lead_time <= 172.50 | | | | | | | |--- avg_price_per_room <= 135.49 | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 135.49 | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | |--- lead_time > 172.50 | | | | | | | |--- weights: [0.00, 21.25] class: 1
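The "truncated branch of depth N" entries in the report above appear because `export_text` prints only the first `max_depth` levels of a tree (10 by default). The sketch below shows how that parameter controls truncation. It uses synthetic data and hypothetical feature names (`f0` to `f4`) as stand-ins, not the hotel bookings data.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (an assumption; not the hotel bookings dataset)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
demo_names = [f"f{i}" for i in range(X_demo.shape[1])]  # hypothetical names

demo_tree = DecisionTreeClassifier(random_state=1, class_weight="balanced")
demo_tree.fit(X_demo, y_demo)

# A small max_depth collapses deeper subtrees into "truncated branch" lines
short_report = export_text(
    demo_tree, feature_names=demo_names, max_depth=1, show_weights=True
)

# Setting max_depth to the fitted depth prints every leaf
full_report = export_text(
    demo_tree,
    feature_names=demo_names,
    max_depth=demo_tree.get_depth(),
    show_weights=True,
)

print(short_report)
print(len(full_report.splitlines()), "lines in the full report")
```

For a tree as deep as the one above, the full report can run to thousands of lines, so a truncated report plus the feature importances is often easier to read.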
# Importance of each feature in building the tree. A feature's importance is the
# (normalized) total reduction of the split criterion brought by that feature,
# also known as the Gini importance.
print(
    pd.DataFrame(
        model.feature_importances_, columns=["Imp"], index=X_train.columns
    ).sort_values(by="Imp", ascending=False)
)
| | Imp |
|---|---|
| lead_time | 3.595298e-01 |
| avg_price_per_room | 1.551888e-01 |
| market_segment_type_Online | 9.406970e-02 |
| arrival_date | 9.370397e-02 |
| no_of_special_requests | 8.657037e-02 |
| arrival_month | 6.692246e-02 |
| no_of_days_stayed | 5.449526e-02 |
| no_of_people | 3.232946e-02 |
| arrival_year | 1.542329e-02 |
| market_segment_type_Offline | 9.894276e-03 |
| required_car_parking_space | 7.758533e-03 |
| room_type_reserved_Room_Type 4 | 7.357731e-03 |
| type_of_meal_plan_Not Selected | 6.934528e-03 |
| room_type_reserved_Room_Type 2 | 2.714500e-03 |
| type_of_meal_plan_Meal Plan 2 | 2.570490e-03 |
| room_type_reserved_Room_Type 5 | 1.395219e-03 |
| repeated_guest | 1.077451e-03 |
| room_type_reserved_Room_Type 6 | 8.055942e-04 |
| market_segment_type_Corporate | 7.862352e-04 |
| room_type_reserved_Room_Type 7 | 3.823624e-04 |
| no_of_previous_bookings_not_canceled | 8.993482e-05 |
| no_of_previous_cancellations | 2.333997e-18 |
| type_of_meal_plan_Meal Plan 3 | 0.000000e+00 |
| room_type_reserved_Room_Type 3 | 0.000000e+00 |
| market_segment_type_Complementary | 0.000000e+00 |
| const | 0.000000e+00 |
# Feature importance plot showing each feature's relative importance in determining booking status
importances = model.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importance")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
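Impurity-based (Gini) importances are computed from the training data and sum to 1, so they describe how the tree was built rather than how much each feature helps on unseen data. As a cross-check, permutation importance on a held-out set can be compared side by side. The sketch below is a minimal illustration on synthetic data with hypothetical feature names. It swaps in `sklearn.inspection.permutation_importance`, which is not part of the original analysis.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the hotel bookings dataset)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
cols = [f"f{i}" for i in range(X_demo.shape[1])]  # hypothetical names
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=1, stratify=y_demo
)

demo_tree = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)

# Impurity-based importances: normalized total criterion reduction per feature
gini_imp = pd.Series(demo_tree.feature_importances_, index=cols)

# Permutation importances: mean drop in recall when a feature is shuffled
perm = permutation_importance(
    demo_tree, X_te, y_te, scoring="recall", n_repeats=10, random_state=1
)
perm_imp = pd.Series(perm.importances_mean, index=cols)

summary = pd.DataFrame({"gini": gini_imp, "permutation": perm_imp})
summary = summary.sort_values("gini", ascending=False)
print(summary)
print("Gini importances sum to", round(gini_imp.sum(), 6))
```

Features that rank high on both measures are the safer ones to base cancellation policies on. A feature that ranks high only on the impurity measure may reflect overfitting to the training data.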
# Choose the type of classifier.
estimator = DecisionTreeClassifier(random_state=1, class_weight="balanced")
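# Grid of parameters to choose from.
# Each entry must be a flat list of candidate values, so the depth range is
# unpacked into individual integers alongside None (no depth limit).
parameters = {
    "max_depth": list(np.arange(2, 50, 5)) + [None],
    "criterion": ["entropy", "gini"],
    "splitter": ["best", "random"],
    "min_impurity_decrease": [0.000001, 0.00001, 0.0001],
}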
# Type of scoring used to compare parameter combinations (recall on the canceled class)
recall_scorer = make_scorer(recall_score)
# Run the grid search with 5-fold cross-validation
grid_obj = GridSearchCV(estimator, parameters, scoring=recall_scorer, cv=5)
grid_obj = grid_obj.fit(X_train, y_train)
# Set the classifier to the best combination of parameters
estimator = grid_obj.best_estimator_
# Fit the best estimator on the training data
# (GridSearchCV refits it by default, so this call is only a safeguard)
estimator.fit(X_train, y_train)
DecisionTreeClassifier(class_weight='balanced', min_impurity_decrease=0.0001,
random_state=1)
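Beyond the single best estimator, a fitted `GridSearchCV` object keeps the cross-validated score of every parameter combination in `cv_results_`, which helps show how close the runners-up were. The sketch below assumes synthetic, mildly imbalanced data and a deliberately small grid (`demo_grid` is a hypothetical name) so that it runs quickly. It is not the grid used above.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with roughly a 67/33 class split (an assumption)
X_demo, y_demo = make_classification(
    n_samples=400, n_features=6, n_informative=3, weights=[0.67, 0.33], random_state=0
)

# Every grid entry is a flat list of scalar candidates
demo_depths = [int(d) for d in np.arange(2, 12, 3)]  # 2, 5, 8, 11
demo_grid = {
    "max_depth": demo_depths + [None],
    "min_impurity_decrease": [0.0001, 0.001],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=1, class_weight="balanced"),
    demo_grid,
    scoring="recall",  # built-in scorer name for recall on the positive class
    cv=5,
)
search.fit(X_demo, y_demo)

# One row per parameter combination, best rank first
results = pd.DataFrame(search.cv_results_)[
    ["param_max_depth", "param_min_impurity_decrease", "mean_test_score", "rank_test_score"]
].sort_values("rank_test_score")

print(search.best_params_)
print(results.head())
```

If several combinations score within noise of each other, preferring the shallowest tree among them usually gives a model that is easier to explain.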
decision_tree_tune_perf_train = model_performance_classification_sklearn(
    estimator, X_train, y_train
)
decision_tree_tune_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.882443 | 0.918331 | 0.769385 | 0.837285 |
confusion_matrix_sklearn(estimator, X_train, y_train)
decision_tree_tune_perf_test = model_performance_classification_sklearn(
    estimator, X_test, y_test
)
decision_tree_tune_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.846733 | 0.871664 | 0.716286 | 0.786373 |
confusion_matrix_sklearn(estimator, X_test, y_test)
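Putting the training and test scores of the tuned tree side by side makes the generalization gap easier to see. The sketch below rebuilds the two one-row result tables from the values reported above (the variable names `train_perf` and `test_perf` are illustrative) and computes the difference per metric.

```python
import pandas as pd

# Scores copied from the tuned tree's training and test performance tables
train_perf = pd.DataFrame(
    {"Accuracy": [0.882443], "Recall": [0.918331], "Precision": [0.769385], "F1": [0.837285]}
)
test_perf = pd.DataFrame(
    {"Accuracy": [0.846733], "Recall": [0.871664], "Precision": [0.716286], "F1": [0.786373]}
)

# Transpose so that each metric becomes a row, then place the two sets side by side
comparison = pd.concat([train_perf.T, test_perf.T], axis=1)
comparison.columns = ["Train", "Test"]
comparison["Gap"] = comparison["Train"] - comparison["Test"]

print(comparison.round(4))
```

Every metric drops by roughly 0.04 to 0.05 from training to test, with recall falling from about 0.918 to 0.872.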
# Plotting decision tree with hyperparameters tuned
plt.figure(figsize=(40, 50))
tree.plot_tree(
    estimator,
    feature_names=feature_names,
    filled=True,
    fontsize=9,
    node_ids=True,
    class_names=True,
)
plt.show()
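Because the tree is fitted with `class_weight="balanced"`, the `weights` shown in the text reports are weighted sample counts rather than raw counts. Each sample is weighted by `n_samples / (n_classes * count_of_its_class)`. The sketch below computes these weights for a hypothetical 67/33 class split. The split is an assumption chosen for illustration, although the smallest leaf weights in the reports (0.75 and 1.52) look like single-sample weights of this kind.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical labels: 67 not-canceled (0) and 33 canceled (1) bookings
y_demo = np.array([0] * 67 + [1] * 33)
classes = np.array([0, 1])

# "balanced" weight per class = n_samples / (n_classes * class_count)
weights = compute_class_weight("balanced", classes=classes, y=y_demo)

for cls, w in zip(classes, weights):
    print(f"class {cls}: weight {w:.2f}")
# class 0: weight 0.75
# class 1: weight 1.52
```

Under this reading, a leaf reported as `weights: [0.00, 3.04]` would hold about two canceled bookings and no others.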
# Text report showing the rules of a decision tree
print(tree.export_text(estimator, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_days_stayed <= 5.50 | | | | | |--- avg_price_per_room <= 201.50 | | | | | | |--- lead_time <= 74.50 | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | | |--- weights: [383.21, 16.70] class: 0 | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | | |--- weights: [19.38, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | |--- avg_price_per_room <= 61.00 | | | | | | | | | | |--- avg_price_per_room <= 59.75 | | | | | | | | | | | |--- weights: [8.95, 1.52] class: 0 | | | | | | | | | | |--- avg_price_per_room > 59.75 | | | | | | | | | | | |--- weights: [0.00, 50.10] class: 1 | | | | | | | | | |--- avg_price_per_room > 61.00 | | | | | | | | | | |--- weights: [44.73, 6.07] class: 0 | | | | | | | |--- arrival_month > 5.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [132.71, 0.00] class: 0 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- avg_price_per_room <= 50.00 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | 
| |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 50.00 | | | | | | | | | | |--- weights: [1302.48, 51.62] class: 0 | | | | | | |--- lead_time > 74.50 | | | | | | | |--- lead_time <= 78.50 | | | | | | | | |--- avg_price_per_room <= 79.78 | | | | | | | | | |--- weights: [14.91, 1.52] class: 0 | | | | | | | | |--- avg_price_per_room > 79.78 | | | | | | | | | |--- weights: [10.44, 57.69] class: 1 | | | | | | | |--- lead_time > 78.50 | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | |--- market_segment_type_Corporate <= 0.50 | | | | | | | | | | |--- weights: [133.45, 10.63] class: 0 | | | | | | | | | |--- market_segment_type_Corporate > 0.50 | | | | | | | | | | |--- weights: [19.38, 9.11] class: 0 | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | |--- weights: [17.15, 6.07] class: 0 | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | |--- weights: [2.24, 12.14] class: 1 | | | | | |--- avg_price_per_room > 201.50 | | | | | | |--- arrival_month <= 10.50 | | | | | | | |--- weights: [0.00, 25.81] class: 1 | | | | | | |--- arrival_month > 10.50 | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | |--- no_of_days_stayed > 5.50 | | | | | |--- avg_price_per_room <= 92.80 | | | | | | |--- weights: [68.59, 12.14] class: 0 | | | | | |--- avg_price_per_room > 92.80 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- arrival_date <= 21.00 | | | | | | | | |--- weights: [4.47, 85.01] class: 1 | | | | | | | |--- arrival_date > 21.00 | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | |--- lead_time > 90.50 | | | | |--- lead_time <= 117.50 | | | | | |--- avg_price_per_room <= 93.58 | | | | | | |--- avg_price_per_room <= 75.07 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | 
| | | | | | | | |--- weights: [10.44, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- lead_time <= 104.50 | | | | | | | | | | | |--- weights: [5.22, 6.07] class: 1 | | | | | | | | | | |--- lead_time > 104.50 | | | | | | | | | | | |--- weights: [2.98, 126.00] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- weights: [9.69, 1.52] class: 0 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | |--- weights: [49.21, 7.59] class: 0 | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | |--- lead_time <= 98.00 | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | | | | |--- lead_time > 98.00 | | | | | | | | | | |--- weights: [0.75, 13.66] class: 1 | | | | | | |--- avg_price_per_room > 75.07 | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | |--- weights: [59.64, 3.04] class: 0 | | | | | | | |--- arrival_month > 3.50 | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- weights: [0.00, 16.70] class: 1 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 86.00 | | | | | | | | | | | |--- weights: [2.24, 16.70] class: 1 | | | | | | | | | | |--- avg_price_per_room > 86.00 | | | | | | | | | | | |--- weights: [8.95, 3.04] class: 0 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [44.73, 4.55] class: 0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | |--- 
avg_price_per_room > 93.58 | | | | | | |--- arrival_date <= 11.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- lead_time <= 108.50 | | | | | | | | | |--- weights: [12.67, 13.66] class: 1 | | | | | | | | |--- lead_time > 108.50 | | | | | | | | | |--- weights: [12.67, 1.52] class: 0 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- lead_time <= 110.50 | | | | | | | | | |--- weights: [4.47, 1.52] class: 0 | | | | | | | | |--- lead_time > 110.50 | | | | | | | | | |--- weights: [6.71, 28.84] class: 1 | | | | | | |--- arrival_date > 11.50 | | | | | | | |--- avg_price_per_room <= 102.09 | | | | | | | | |--- arrival_date <= 14.50 | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | |--- arrival_date > 14.50 | | | | | | | | | |--- weights: [3.73, 144.22] class: 1 | | | | | | | |--- avg_price_per_room > 102.09 | | | | | | | | |--- avg_price_per_room <= 109.50 | | | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | | | |--- weights: [0.75, 16.70] class: 1 | | | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | | | |--- weights: [33.55, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 109.50 | | | | | | | | | |--- avg_price_per_room <= 124.25 | | | | | | | | | | |--- weights: [2.98, 75.91] class: 1 | | | | | | | | | |--- avg_price_per_room > 124.25 | | | | | | | | | | |--- weights: [3.73, 3.04] class: 0 | | | | |--- lead_time > 117.50 | | | | | |--- no_of_people <= 1.50 | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | |--- weights: [104.38, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- no_of_people > 1.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | |--- avg_price_per_room <= 216.00 | | | | | | | | | |--- weights: [42.50, 3.04] class: 0 | | | | | | | | |--- avg_price_per_room > 216.00 | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | |--- arrival_date 
> 7.50 | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- weights: [0.75, 28.84] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 74.12 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 74.12 | | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0 | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | |--- lead_time <= 150.50 | | | | | | | | | | |--- weights: [40.26, 4.55] class: 0 | | | | | | | | | |--- lead_time > 150.50 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [48.46, 0.00] class: 0 | | |--- market_segment_type_Online > 0.50 | | | |--- lead_time <= 13.50 | | | | |--- avg_price_per_room <= 99.44 | | | | | |--- arrival_month <= 1.50 | | | | | | |--- weights: [92.45, 0.00] class: 0 | | | | | |--- arrival_month > 1.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | |--- lead_time <= 5.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- weights: [44.73, 1.52] class: 0 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 74.40 | | | | | | | | | | | |--- weights: [18.64, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 74.40 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- lead_time > 5.50 | | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 68.38 | | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 68.38 | | | | | | | | | | | |--- 
weights: [25.35, 33.40] class: 1 | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 85.50 | | | | | | | | | | |--- weights: [0.00, 22.77] class: 1 | | | | | | | | | |--- avg_price_per_room > 85.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | |--- lead_time <= 2.50 | | | | | | | | | | |--- arrival_date <= 20.50 | | | | | | | | | | | |--- weights: [13.42, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 20.50 | | | | | | | | | | | |--- weights: [2.98, 4.55] class: 1 | | | | | | | | | |--- lead_time > 2.50 | | | | | | | | | | |--- arrival_date <= 13.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 13.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- no_of_days_stayed <= 5.50 | | | | | | | | |--- weights: [159.55, 7.59] class: 0 | | | | | | | |--- no_of_days_stayed > 5.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- weights: [0.75, 9.11] class: 1 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | |--- avg_price_per_room > 99.44 | | | | | |--- lead_time <= 3.50 | | | | | | |--- avg_price_per_room <= 202.67 | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- weights: [63.37, 30.36] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [155.82, 25.81] class: 0 | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | |--- weights: [0.00, 6.07] class: 1 | | | | | | |--- avg_price_per_room > 202.67 | | | | | | | |--- weights: [0.75, 22.77] class: 1 | | | | | |--- lead_time > 3.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- weights: [61.14, 232.27] class: 1 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_year <= 
2017.50 | | | | | | | | |--- weights: [26.09, 1.52] class: 0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- arrival_date <= 14.00 | | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | | |--- weights: [7.46, 36.43] class: 1 | | | | | | | | | |--- arrival_date > 14.00 | | | | | | | | | | |--- avg_price_per_room <= 208.67 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 208.67 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [15.66, 0.00] class: 0 | | | |--- lead_time > 13.50 | | | | |--- required_car_parking_space <= 0.50 | | | | | |--- avg_price_per_room <= 71.92 | | | | | | |--- avg_price_per_room <= 59.43 | | | | | | | |--- lead_time <= 84.50 | | | | | | | | |--- weights: [50.70, 7.59] class: 0 | | | | | | | |--- lead_time > 84.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_date <= 27.00 | | | | | | | | | | |--- lead_time <= 131.50 | | | | | | | | | | | |--- weights: [0.75, 15.18] class: 1 | | | | | | | | | | |--- lead_time > 131.50 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | |--- arrival_date > 27.00 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 59.43 | | | | | | | |--- lead_time <= 25.50 | | | | | | | | |--- weights: [20.88, 6.07] class: 0 | | | | | | | |--- lead_time > 25.50 | | | | | | | | |--- avg_price_per_room <= 71.34 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- weights: [27.59, 97.16] class: 1 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- lead_time <= 102.00 | | | | | | | | | | | |--- truncated 
branch of depth 4 | | | | | | | | | | |--- lead_time > 102.00 | | | | | | | | | | | |--- weights: [12.67, 3.04] class: 0 | | | | | | | | |--- avg_price_per_room > 71.34 | | | | | | | | | |--- weights: [11.18, 0.00] class: 0 | | | | | |--- avg_price_per_room > 71.92 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- lead_time <= 65.50 | | | | | | | | |--- avg_price_per_room <= 120.45 | | | | | | | | | |--- weights: [79.77, 9.11] class: 0 | | | | | | | | |--- avg_price_per_room > 120.45 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- weights: [4.47, 12.14] class: 1 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | |--- lead_time > 65.50 | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- weights: [16.40, 47.06] class: 1 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50 | | | | | | | | | |--- weights: [0.00, 63.76] class: 1 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- avg_price_per_room <= 104.31 | | | | | | | | |--- lead_time <= 25.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [16.40, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- weights: [38.77, 118.41] class: 1 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [23.11, 0.00] class: 0 | | | | | | | | |--- lead_time > 25.50 | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | | | |--- truncated branch of depth 7 | | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | | |--- weights: [61.88, 242.90] class: 1 | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | | | |--- 
weights: [73.81, 411.41] class: 1 | | | | | | | |--- avg_price_per_room > 104.31 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 195.30 | | | | | | | | | | | |--- weights: [325.81, 2031.24] class: 1 | | | | | | | | | | |--- avg_price_per_room > 195.30 | | | | | | | | | | | |--- weights: [0.75, 138.15] class: 1 | | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- weights: [0.75, 9.11] class: 1 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 162.82 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- avg_price_per_room > 162.82 | | | | | | | | | | | |--- weights: [9.69, 1.52] class: 0 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- lead_time <= 22.00 | | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 22.00 | | | | | | | | | | | |--- weights: [6.71, 60.72] class: 1 | | | | |--- required_car_parking_space > 0.50 | | | | | |--- no_of_days_stayed <= 11.00 | | | | | | |--- weights: [48.46, 0.00] class: 0 | | | | | |--- no_of_days_stayed > 11.00 | | | | | | |--- weights: [0.00, 1.52] class: 1 | |--- no_of_special_requests > 0.50 | | |--- no_of_special_requests <= 1.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- lead_time <= 102.50 | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | |--- no_of_days_stayed <= 15.00 | | | | | | | |--- weights: [697.09, 7.59] class: 0 | | | | | | |--- no_of_days_stayed > 15.00 | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | |--- lead_time <= 63.00 | | | | | | | |--- weights: [15.66, 1.52] 
class: 0 | | | | | | |--- lead_time > 63.00 | | | | | | | |--- weights: [0.00, 7.59] class: 1 | | | | |--- lead_time > 102.50 | | | | | |--- lead_time <= 104.50 | | | | | | |--- weights: [3.73, 6.07] class: 1 | | | | | |--- lead_time > 104.50 | | | | | | |--- lead_time <= 150.50 | | | | | | | |--- avg_price_per_room <= 141.75 | | | | | | | | |--- weights: [71.57, 10.63] class: 0 | | | | | | | |--- avg_price_per_room > 141.75 | | | | | | | | |--- weights: [0.75, 3.04] class: 1 | | | | | | |--- lead_time > 150.50 | | | | | | | |--- weights: [0.75, 3.04] class: 1 | | | |--- market_segment_type_Online > 0.50 | | | | |--- lead_time <= 8.50 | | | | | |--- lead_time <= 4.50 | | | | | | |--- no_of_days_stayed <= 14.00 | | | | | | | |--- weights: [498.03, 40.99] class: 0 | | | | | | |--- no_of_days_stayed > 14.00 | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- lead_time > 4.50 | | | | | | |--- arrival_date <= 13.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | | |--- avg_price_per_room <= 88.39 | | | | | | | | | | |--- weights: [17.15, 1.52] class: 0 | | | | | | | | | |--- avg_price_per_room > 88.39 | | | | | | | | | | |--- weights: [41.01, 30.36] class: 0 | | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | | |--- weights: [0.75, 4.55] class: 1 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- weights: [33.55, 1.52] class: 0 | | | | | | |--- arrival_date > 13.50 | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | |--- weights: [123.76, 9.11] class: 0 | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | |--- avg_price_per_room <= 126.33 | | | | | | | | | |--- weights: [32.80, 3.04] class: 0 | | | | | | | | |--- avg_price_per_room > 126.33 | | | | | | | | | |--- weights: [9.69, 13.66] class: 1 | | | | |--- lead_time > 8.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 
118.55 | | | | | | | |--- lead_time <= 61.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [65.61, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- truncated branch of depth 11 | | | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- no_of_days_stayed <= 12.50 | | | | | | | | | | |--- weights: [126.74, 0.00] class: 0 | | | | | | | | | |--- no_of_days_stayed > 12.50 | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | |--- lead_time > 61.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | | |--- weights: [2.98, 57.69] class: 1 | | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- lead_time <= 66.50 | | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 66.50 | | | | | | | | | | | |--- weights: [37.28, 54.65] class: 1 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- avg_price_per_room <= 71.93 | | | | | | | | | | | |--- weights: [54.43, 3.04] class: 0 | | | | | | | | | | |--- avg_price_per_room > 71.93 | | | | | | | | | | | |--- truncated branch of depth 14 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- truncated branch of 
depth 5 | | | | | | |--- avg_price_per_room > 118.55 | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- lead_time <= 146.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- lead_time > 146.50 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- weights: [48.46, 18.22] class: 0 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- weights: [44.73, 53.13] class: 1 | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- avg_price_per_room <= 121.20 | | | | | | | | | | | |--- weights: [18.64, 6.07] class: 0 | | | | | | | | | | |--- avg_price_per_room > 121.20 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- lead_time <= 55.50 | | | | | | | | | | | |--- weights: [43.99, 16.70] class: 0 | | | | | | | | | | |--- lead_time > 55.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | |--- arrival_month > 8.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- weights: [11.93, 10.63] class: 0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- weights: [37.28, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- weights: [244.54, 332.47] class: 1 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 100.00 | | | | | | | | | | | |--- weights: [49.95, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 100.00 | | | | | | | | | | | |--- weights: [0.75, 18.22] class: 1 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- no_of_days_stayed <= 10.50 | | | | | | | 
|--- weights: [134.20, 0.00] class: 0 | | | | | | |--- no_of_days_stayed > 10.50 | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | |--- no_of_special_requests > 1.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_days_stayed <= 4.50 | | | | | |--- weights: [1582.81, 13.66] class: 0 | | | | |--- no_of_days_stayed > 4.50 | | | | | |--- no_of_days_stayed <= 12.00 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | |--- weights: [143.15, 22.77] class: 0 | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | |--- weights: [40.26, 18.22] class: 0 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [51.44, 0.00] class: 0 | | | | | |--- no_of_days_stayed > 12.00 | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | |--- lead_time > 90.50 | | | | |--- no_of_special_requests <= 2.50 | | | | | |--- arrival_month <= 8.50 | | | | | | |--- avg_price_per_room <= 202.95 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | |--- weights: [1.49, 9.11] class: 1 | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | |--- weights: [8.20, 3.04] class: 0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- lead_time <= 150.50 | | | | | | | | | |--- no_of_days_stayed <= 5.50 | | | | | | | | | | |--- weights: [154.33, 16.70] class: 0 | | | | | | | | | |--- no_of_days_stayed > 5.50 | | | | | | | | | | |--- arrival_date <= 5.50 | | | | | | | | | | | |--- weights: [1.49, 6.07] class: 1 | | | | | | | | | | |--- arrival_date > 5.50 | | | | | | | | | | | |--- weights: [19.38, 6.07] class: 0 | | | | | | | | |--- lead_time > 150.50 | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | |--- avg_price_per_room > 202.95 | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | |--- arrival_month > 8.50 | | | | | | |--- avg_price_per_room <= 153.15 | | | | | | | |--- room_type_reserved_Room_Type 2 <= 0.50 | | | | | | | | |--- 
weights: [87.98, 103.23] class: 1 | | | | | | | |--- room_type_reserved_Room_Type 2 > 0.50 | | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 153.15 | | | | | | | |--- weights: [12.67, 3.04] class: 0 | | | | |--- no_of_special_requests > 2.50 | | | | | |--- weights: [67.10, 0.00] class: 0 |--- lead_time > 151.50 | |--- avg_price_per_room <= 100.04 | | |--- no_of_special_requests <= 0.50 | | | |--- no_of_people <= 1.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | | |--- lead_time <= 163.50 | | | | | | |--- lead_time <= 160.50 | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | |--- lead_time > 160.50 | | | | | | | |--- weights: [0.75, 24.29] class: 1 | | | | | |--- lead_time > 163.50 | | | | | | |--- lead_time <= 341.00 | | | | | | | |--- lead_time <= 173.00 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- weights: [46.97, 9.11] class: 0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- no_of_days_stayed <= 3.00 | | | | | | | | | | |--- weights: [0.00, 13.66] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.00 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | |--- lead_time > 173.00 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [187.88, 7.59] class: 0 | | | | | | |--- lead_time > 341.00 | | | | | | | |--- arrival_date <= 8.50 | | | | | | | | |--- weights: [0.75, 12.14] class: 1 | | | | | | | |--- arrival_date > 8.50 | | | | | | | | |--- weights: [12.67, 15.18] class: 1 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- avg_price_per_room <= 2.50 | | | | | | |--- lead_time <= 285.50 | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | | |--- lead_time > 285.50 | | | | | | 
| |--- weights: [0.75, 3.04] class: 1 | | | | | |--- avg_price_per_room > 2.50 | | | | | | |--- weights: [0.75, 80.46] class: 1 | | | |--- no_of_people > 1.50 | | | | |--- avg_price_per_room <= 82.47 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- weights: [2.98, 288.44] class: 1 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- lead_time <= 244.00 | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | |--- lead_time <= 166.50 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | |--- lead_time > 166.50 | | | | | | | | | | |--- arrival_date <= 19.00 | | | | | | | | | | | |--- weights: [0.75, 57.69] class: 1 | | | | | | | | | | |--- arrival_date > 19.00 | | | | | | | | | | | |--- weights: [1.49, 0.00] class: 0 | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50 | | | | | | | | | | | |--- weights: [53.68, 3.04] class: 0 | | | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50 | | | | | | | | | | | |--- weights: [0.00, 1.52] class: 1 | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | |--- arrival_date <= 15.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 15.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | |--- lead_time > 244.00 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [25.35, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- avg_price_per_room <= 76.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- avg_price_per_room > 76.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | |--- arrival_month > 
11.50 | | | | | | | |--- weights: [46.97, 0.00] class: 0 | | | | |--- avg_price_per_room > 82.47 | | | | | |--- lead_time <= 324.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- weights: [14.91, 1008.03] class: 1 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | |--- weights: [0.00, 19.74] class: 1 | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | |--- lead_time > 324.50 | | | | | | |--- avg_price_per_room <= 89.00 | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 89.00 | | | | | | | |--- weights: [0.75, 13.66] class: 1 | | |--- no_of_special_requests > 0.50 | | | |--- market_segment_type_Offline <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- lead_time <= 159.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- weights: [20.88, 9.11] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | |--- weights: [3.73, 1.52] class: 0 | | | | | | | |--- arrival_date > 23.50 | | | | | | | | |--- weights: [0.00, 12.14] class: 1 | | | | | |--- lead_time > 159.50 | | | | | | |--- weights: [94.69, 21.25] class: 0 | | | | |--- lead_time > 180.50 | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- avg_price_per_room <= 66.47 | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | |--- avg_price_per_room > 66.47 | | | | | | | | |--- weights: [43.99, 186.73] class: 1 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | |--- no_of_days_stayed > 3.50 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- no_of_days_stayed <= 11.50 | | | | | | | | |--- avg_price_per_room <= 69.40 | | | | | | | | | |--- weights: [14.17, 4.55] class: 0 | | 
| | | | | | |--- avg_price_per_room > 69.40 | | | | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | | | |--- truncated branch of depth 6 | | | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | | | |--- weights: [10.44, 27.33] class: 1 | | | | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | | |--- no_of_days_stayed > 11.50 | | | | | | | | |--- weights: [0.75, 10.63] class: 1 | | | |--- market_segment_type_Offline > 0.50 | | | | |--- lead_time <= 348.50 | | | | | |--- weights: [127.49, 7.59] class: 0 | | | | |--- lead_time > 348.50 | | | | | |--- weights: [5.96, 6.07] class: 1 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- weights: [0.00, 3200.19] class: 1 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [23.11, 0.00] class: 0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- weights: [35.04, 0.00] class: 0 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | |--- arrival_date > 24.50 | | | | | |--- weights: [3.73, 22.77] class: 1
# Visualizing feature importance
importances = estimator.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importance (Pre Pruning)")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
The DecisionTreeClassifier provides parameters such as min_samples_leaf and max_depth to prevent a tree from overfitting. Cost complexity pruning provides another option to control the size of a tree. In DecisionTreeClassifier, this pruning technique is parameterized by the cost complexity parameter, ccp_alpha. Greater values of ccp_alpha increase the number of nodes pruned.
Minimal cost complexity pruning recursively finds the node with the "weakest link". The weakest link is characterized by an effective alpha, where the nodes with the smallest effective alpha are pruned first. To get an idea of what values of ccp_alpha could be appropriate, scikit-learn provides DecisionTreeClassifier.cost_complexity_pruning_path that returns the effective alphas and the corresponding total leaf impurities at each step of the pruning process. As alpha increases, more of the tree is pruned, which increases the total impurity of its leaves.
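To make the "weakest link" idea concrete, the sketch below computes the effective alpha of an internal node t as (R(t) - R(T_t)) / (|T_t| - 1), where R(t) is the impurity of t if it were collapsed into a leaf, R(T_t) is the total impurity of the leaves under t, and |T_t| is the number of those leaves. The three nodes and their impurity values are hypothetical, chosen only for illustration; they do not come from the hotel data.

```python
def effective_alpha(node_impurity, subtree_leaf_impurity, n_leaves):
    """Effective alpha of an internal node: (R(t) - R(T_t)) / (|T_t| - 1)."""
    if n_leaves < 2:
        raise ValueError("the subtree of an internal node has at least two leaves")
    return (node_impurity - subtree_leaf_impurity) / (n_leaves - 1)


# Hypothetical internal nodes (illustrative values only):
# name -> (R(t) if collapsed to a leaf, R(T_t) summed over its leaves, leaf count)
candidates = {
    "node_A": (0.10, 0.04, 3),
    "node_B": (0.06, 0.05, 2),
    "node_C": (0.20, 0.02, 4),
}

alphas = {name: effective_alpha(*values) for name, values in candidates.items()}

# The node with the smallest effective alpha is the weakest link and is pruned first.
weakest = min(alphas, key=alphas.get)

for name, alpha in sorted(alphas.items(), key=lambda item: item[1]):
    print(f"{name}: effective alpha = {alpha:.4f}")
print("pruned first:", weakest)
```

Collapsing node_B gives up the least impurity reduction per leaf removed, so it goes first; the nodes with larger effective alphas survive until ccp_alpha grows past their values.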
clf = DecisionTreeClassifier(random_state=1, class_weight = "balanced")
path = clf.cost_complexity_pruning_path(X_train, y_train)
ccp_alphas, impurities = path.ccp_alphas, path.impurities
# Looking at alpha values and impurities
pd.DataFrame(path)
| | ccp_alphas | impurities |
|---|---|---|
| 0 | 0.000000e+00 | 0.008376 |
| 1 | 2.933821e-20 | 0.008376 |
| 2 | 2.933821e-20 | 0.008376 |
| 3 | 2.933821e-20 | 0.008376 |
| 4 | 2.933821e-20 | 0.008376 |
| ... | ... | ... |
| 1855 | 8.901596e-03 | 0.328058 |
| 1856 | 9.802243e-03 | 0.337860 |
| 1857 | 1.271875e-02 | 0.350579 |
| 1858 | 3.412090e-02 | 0.418821 |
| 1859 | 8.117914e-02 | 0.500000 |
1860 rows × 2 columns
# Generating a plot showing Total Impurity vs. Effective Alpha for the training set
fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(ccp_alphas[:-1], impurities[:-1], marker="o", drawstyle="steps-post")
ax.set_xlabel("effective alpha")
ax.set_ylabel("total impurity of leaves")
ax.set_title("Total Impurity vs effective alpha for training set")
plt.show()
# Next, train one decision tree per effective alpha. The last value in ccp_alphas is the alpha that prunes the whole tree, leaving the last tree, clfs[-1], with a single node.
clfs = []
for ccp_alpha in ccp_alphas:
    clf = DecisionTreeClassifier(random_state=1, ccp_alpha=ccp_alpha, class_weight="balanced")
    clf.fit(X_train, y_train)
    clfs.append(clf)
print(
    "Number of nodes in the last tree is: {} with ccp_alpha: {}".format(
        clfs[-1].tree_.node_count, ccp_alphas[-1]
    )
)
Number of nodes in the last tree is: 1 with ccp_alpha: 0.08117914389137182
# For the remainder, drop the last element of clfs and ccp_alphas, because it is the trivial tree with only one node. The plots show that the number of nodes and the tree depth decrease as alpha increases.
# Generating a plot of Number of nodes vs alpha and Depth vs alpha
clfs = clfs[:-1]
ccp_alphas = ccp_alphas[:-1]
node_counts = [clf.tree_.node_count for clf in clfs]
depth = [clf.tree_.max_depth for clf in clfs]
fig, ax = plt.subplots(2, 1, figsize=(10, 7))
ax[0].plot(ccp_alphas, node_counts, marker="o", drawstyle="steps-post")
ax[0].set_xlabel("alpha")
ax[0].set_ylabel("number of nodes")
ax[0].set_title("Number of nodes vs alpha")
ax[1].plot(ccp_alphas, depth, marker="o", drawstyle="steps-post")
ax[1].set_xlabel("alpha")
ax[1].set_ylabel("depth of tree")
ax[1].set_title("Depth vs alpha")
fig.tight_layout()
# Making a list of recall values from training set
recall_train = []
for clf in clfs:
    pred_train = clf.predict(X_train)
    values_train = recall_score(y_train, pred_train)
    recall_train.append(values_train)
# Making a list of recall values from test set
recall_test = []
for clf in clfs:
    pred_test = clf.predict(X_test)
    values_test = recall_score(y_test, pred_test)
    recall_test.append(values_test)
# Calculating train and test scores
train_scores = [clf.score(X_train, y_train) for clf in clfs]
test_scores = [clf.score(X_test, y_test) for clf in clfs]
# Comparing Recall and Alpha for training and test sets
fig, ax = plt.subplots(figsize=(15, 5))
ax.set_xlabel("alpha")
ax.set_ylabel("Recall")
ax.set_title("Recall vs alpha for training and testing sets")
ax.plot(ccp_alphas, recall_train, marker="o", label="train", drawstyle="steps-post")
ax.plot(ccp_alphas, recall_test, marker="o", label="test", drawstyle="steps-post")
ax.legend()
plt.show()
# Selecting the model with the highest recall on the test set
index_best_model = np.argmax(recall_test)
best_model = clfs[index_best_model]
print(best_model)
DecisionTreeClassifier(ccp_alpha=0.00015477722021374038,
class_weight='balanced', random_state=1)
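One detail of the selection step is worth spelling out: np.argmax returns the index of the first maximum. Because ccp_alphas is in ascending order, a tie on test recall is resolved in favor of the smallest alpha, that is, the least-pruned tree. The sketch below shows this on made-up numbers (the arrays are hypothetical, not the notebook's results) and also shows one way to pick the last maximum instead, should a smaller tree be preferred among ties.

```python
import numpy as np

# Hypothetical values for illustration only; not the notebook's results.
alphas = np.array([0.0, 0.001, 0.002, 0.004])  # ascending, like the pruning path
recall = np.array([0.80, 0.87, 0.87, 0.85])    # two alphas tie for the best recall

# np.argmax keeps the first occurrence of the maximum: the smallest tied alpha.
first_best = int(np.argmax(recall))

# Searching the reversed array finds the last occurrence: the largest tied alpha.
last_best = len(recall) - 1 - int(np.argmax(recall[::-1]))

print("first maximum -> alpha", alphas[first_best])
print("last maximum  -> alpha", alphas[last_best])
```

Whether to prefer the first or the last tied alpha is a judgment call; the larger alpha gives a simpler tree at the same recall.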
best_model.fit(X_train, y_train)
DecisionTreeClassifier(ccp_alpha=0.00015477722021374038,
class_weight='balanced', random_state=1)
# Checking model performance on training set
confusion_matrix_sklearn(best_model, X_train, y_train)
# Checking model performance on training set
decision_tree_postpruned_perf_train = model_performance_classification_sklearn(
best_model, X_train, y_train
)
decision_tree_postpruned_perf_train
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.875158 | 0.895253 | 0.765464 | 0.825287 |
# Checking model performance on test set
confusion_matrix_sklearn(best_model, X_test, y_test)
# Checking model performance on test set
decision_tree_postpruned_perf_test = model_performance_classification_sklearn(
best_model, X_test, y_test
)
decision_tree_postpruned_perf_test
| | Accuracy | Recall | Precision | F1 |
|---|---|---|---|---|
| 0 | 0.852706 | 0.866837 | 0.729162 | 0.792061 |
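To read the two post-pruning tables side by side, the short sketch below copies the reported metrics into plain dictionaries and prints the train-minus-test gap for each. The dictionary names are illustrative; only the numbers come from the tables above.

```python
# Metrics copied from the post-pruning performance tables above.
train = {"Accuracy": 0.875158, "Recall": 0.895253, "Precision": 0.765464, "F1": 0.825287}
test = {"Accuracy": 0.852706, "Recall": 0.866837, "Precision": 0.729162, "F1": 0.792061}

# Train-minus-test gap per metric; a large positive gap would point to overfitting.
gaps = {name: round(train[name] - test[name], 6) for name in train}

for name, gap in gaps.items():
    print(f"{name:<10} train={train[name]:.4f}  test={test[name]:.4f}  gap={gap:.4f}")
```

Each metric drops by roughly 0.02 to 0.04 from the training set to the test set.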
# Visualizing pruned tree
plt.figure(figsize=(40, 50))
out = tree.plot_tree(
best_model,
feature_names=feature_names,
filled=True,
fontsize=9,
node_ids=True,
class_names=True,
)
for o in out:
    arrow = o.arrow_patch
    if arrow is not None:
        arrow.set_edgecolor("black")
        arrow.set_linewidth(1)
plt.show()
# Text report showing the rules of a pruned decision tree
print(tree.export_text(best_model, feature_names=feature_names, show_weights=True))
|--- lead_time <= 151.50 | |--- no_of_special_requests <= 0.50 | | |--- market_segment_type_Online <= 0.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_days_stayed <= 5.50 | | | | | |--- avg_price_per_room <= 201.50 | | | | | | |--- lead_time <= 74.50 | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | |--- lead_time <= 59.50 | | | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | | | |--- weights: [383.21, 16.70] class: 0 | | | | | | | | | |--- lead_time > 59.50 | | | | | | | | | | |--- arrival_date <= 16.50 | | | | | | | | | | | |--- weights: [19.38, 0.00] class: 0 | | | | | | | | | | |--- arrival_date > 16.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | |--- avg_price_per_room <= 61.00 | | | | | | | | | | |--- avg_price_per_room <= 59.75 | | | | | | | | | | | |--- weights: [8.95, 1.52] class: 0 | | | | | | | | | | |--- avg_price_per_room > 59.75 | | | | | | | | | | | |--- weights: [0.00, 50.10] class: 1 | | | | | | | | | |--- avg_price_per_room > 61.00 | | | | | | | | | | |--- weights: [44.73, 6.07] class: 0 | | | | | | | |--- arrival_month > 5.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- repeated_guest <= 0.50 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | | | | |--- weights: [343.70, 69.83] class: 0 | | | | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | |--- repeated_guest > 0.50 | | | | | | | | | | |--- weights: [132.71, 0.00] class: 0 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- avg_price_per_room <= 50.00 | | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | 
| | | | |--- arrival_month > 9.50 | | | | | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | | | | |--- avg_price_per_room > 50.00 | | | | | | | | | | |--- arrival_date <= 1.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 1.50 | | | | | | | | | | | |--- weights: [1285.33, 45.54] class: 0 | | | | | | |--- lead_time > 74.50 | | | | | | | |--- lead_time <= 78.50 | | | | | | | | |--- avg_price_per_room <= 79.78 | | | | | | | | | |--- weights: [14.91, 1.52] class: 0 | | | | | | | | |--- avg_price_per_room > 79.78 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- weights: [0.00, 28.84] class: 1 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- weights: [2.24, 28.84] class: 1 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | | | |--- lead_time > 78.50 | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | |--- weights: [152.84, 19.74] class: 0 | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | | |--- weights: [17.15, 6.07] class: 0 | | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | | |--- weights: [2.24, 12.14] class: 1 | | | | | |--- avg_price_per_room > 201.50 | | | | | | |--- weights: [1.49, 25.81] class: 1 | | | | |--- no_of_days_stayed > 5.50 | | | | | |--- avg_price_per_room <= 92.80 | | | | | | |--- weights: [68.59, 12.14] class: 0 | | | | | |--- avg_price_per_room > 92.80 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- arrival_date <= 21.00 | | | | | | | | |--- weights: [4.47, 85.01] class: 1 | | | | | | | |--- arrival_date > 21.00 | | | | | | | | |--- weights: [5.22, 0.00] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | |--- lead_time > 90.50 | | | | |--- lead_time <= 117.50 | | | | | |--- 
avg_price_per_room <= 93.58 | | | | | | |--- avg_price_per_room <= 75.07 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- avg_price_per_room <= 58.75 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 58.75 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- lead_time <= 104.50 | | | | | | | | | | | |--- weights: [5.22, 6.07] class: 1 | | | | | | | | | | |--- lead_time > 104.50 | | | | | | | | | | | |--- weights: [2.98, 126.00] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- weights: [9.69, 1.52] class: 0 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- weights: [2.98, 15.18] class: 1 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- arrival_date <= 29.50 | | | | | | | | | |--- weights: [49.21, 7.59] class: 0 | | | | | | | | |--- arrival_date > 29.50 | | | | | | | | | |--- lead_time <= 98.00 | | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | | | | |--- lead_time > 98.00 | | | | | | | | | | |--- weights: [0.75, 13.66] class: 1 | | | | | | |--- avg_price_per_room > 75.07 | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | |--- weights: [59.64, 3.04] class: 0 | | | | | | | |--- arrival_month > 3.50 | | | | | | | | |--- arrival_month <= 4.50 | | | | | | | | | |--- weights: [1.49, 16.70] class: 1 | | | | | | | | |--- arrival_month > 4.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- avg_price_per_room <= 86.00 | | | | | | | | | | | |--- weights: [2.24, 16.70] class: 1 | | | | | | | | | | |--- avg_price_per_room > 86.00 | | | | | | | | | | | |--- weights: [8.95, 3.04] class: 0 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [44.73, 4.55] class: 0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | 
|--- avg_price_per_room > 93.58 | | | | | | |--- arrival_date <= 11.50 | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | |--- weights: [25.35, 15.18] class: 0 | | | | | | | |--- arrival_month > 7.50 | | | | | | | | |--- weights: [11.18, 30.36] class: 1 | | | | | | |--- arrival_date > 11.50 | | | | | | | |--- avg_price_per_room <= 102.09 | | | | | | | | |--- weights: [5.22, 144.22] class: 1 | | | | | | | |--- avg_price_per_room > 102.09 | | | | | | | | |--- avg_price_per_room <= 109.50 | | | | | | | | | |--- no_of_days_stayed <= 1.50 | | | | | | | | | | |--- weights: [0.75, 16.70] class: 1 | | | | | | | | | |--- no_of_days_stayed > 1.50 | | | | | | | | | | |--- weights: [33.55, 0.00] class: 0 | | | | | | | | |--- avg_price_per_room > 109.50 | | | | | | | | | |--- weights: [6.71, 78.94] class: 1 | | | | |--- lead_time > 117.50 | | | | | |--- no_of_people <= 1.50 | | | | | | |--- avg_price_per_room <= 122.00 | | | | | | | |--- weights: [104.38, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 122.00 | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- no_of_people > 1.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | |--- weights: [42.50, 4.55] class: 0 | | | | | | | |--- arrival_date > 7.50 | | | | | | | | |--- arrival_date <= 24.50 | | | | | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | | | | | |--- arrival_date <= 23.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 23.50 | | | | | | | | | | | |--- weights: [0.75, 28.84] class: 1 | | | | | | | | | |--- no_of_days_stayed > 3.50 | | | | | | | | | | |--- avg_price_per_room <= 74.12 | | | | | | | | | | | |--- weights: [5.96, 9.11] class: 1 | | | | | | | | | | |--- avg_price_per_room > 74.12 | | | | | | | | | | | |--- weights: [20.13, 0.00] class: 0 | | | | | | | | |--- arrival_date > 24.50 | | | | | | | | | |--- avg_price_per_room <= 173.26 | | | | | | | | | | |--- weights: [40.26, 4.55] class: 0 
| | | | | | | | | |--- avg_price_per_room > 173.26 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [48.46, 0.00] class: 0 | | |--- market_segment_type_Online > 0.50 | | | |--- lead_time <= 13.50 | | | | |--- avg_price_per_room <= 99.44 | | | | | |--- arrival_month <= 1.50 | | | | | | |--- weights: [92.45, 0.00] class: 0 | | | | | |--- arrival_month > 1.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | |--- lead_time <= 5.50 | | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | | |--- weights: [44.73, 1.52] class: 0 | | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | | |--- avg_price_per_room <= 74.40 | | | | | | | | | | | |--- weights: [18.64, 0.00] class: 0 | | | | | | | | | | |--- avg_price_per_room > 74.40 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | |--- lead_time > 5.50 | | | | | | | | | |--- weights: [37.28, 33.40] class: 0 | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | |--- no_of_people <= 1.50 | | | | | | | | | |--- avg_price_per_room <= 85.50 | | | | | | | | | | |--- weights: [0.00, 22.77] class: 1 | | | | | | | | | |--- avg_price_per_room > 85.50 | | | | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | | | |--- no_of_people > 1.50 | | | | | | | | | |--- weights: [32.80, 24.29] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- no_of_days_stayed <= 5.50 | | | | | | | | |--- weights: [159.55, 7.59] class: 0 | | | | | | | |--- no_of_days_stayed > 5.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- weights: [0.75, 9.11] class: 1 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | |--- avg_price_per_room > 99.44 | | | | | |--- lead_time <= 3.50 | | | | | | |--- avg_price_per_room <= 202.67 | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | |--- arrival_month <= 5.50 | | | 
| | | | | | |--- weights: [63.37, 30.36] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [155.82, 25.81] class: 0 | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | |--- weights: [0.00, 6.07] class: 1 | | | | | | |--- avg_price_per_room > 202.67 | | | | | | | |--- weights: [0.75, 22.77] class: 1 | | | | | |--- lead_time > 3.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- avg_price_per_room <= 119.25 | | | | | | | | |--- avg_price_per_room <= 118.50 | | | | | | | | | |--- weights: [18.64, 59.21] class: 1 | | | | | | | | |--- avg_price_per_room > 118.50 | | | | | | | | | |--- weights: [8.20, 1.52] class: 0 | | | | | | | |--- avg_price_per_room > 119.25 | | | | | | | | |--- weights: [34.30, 171.55] class: 1 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [26.09, 1.52] class: 0 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- arrival_date <= 14.00 | | | | | | | | | | |--- weights: [9.69, 36.43] class: 1 | | | | | | | | | |--- arrival_date > 14.00 | | | | | | | | | | |--- avg_price_per_room <= 208.67 | | | | | | | | | | | |--- weights: [20.13, 6.07] class: 0 | | | | | | | | | | |--- avg_price_per_room > 208.67 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [15.66, 0.00] class: 0 | | | |--- lead_time > 13.50 | | | | |--- required_car_parking_space <= 0.50 | | | | | |--- avg_price_per_room <= 71.92 | | | | | | |--- avg_price_per_room <= 59.43 | | | | | | | |--- lead_time <= 84.50 | | | | | | | | |--- weights: [50.70, 7.59] class: 0 | | | | | | | |--- lead_time > 84.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_date <= 27.00 | | | | | | | | | | |--- weights: [2.98, 15.18] class: 1 | | | | | | | | | |--- arrival_date > 27.00 | | | | | | | | | | |--- weights: [3.73, 0.00] class: 0 
| | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- weights: [10.44, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 59.43 | | | | | | | |--- lead_time <= 25.50 | | | | | | | | |--- weights: [20.88, 6.07] class: 0 | | | | | | | |--- lead_time > 25.50 | | | | | | | | |--- avg_price_per_room <= 71.34 | | | | | | | | | |--- arrival_month <= 3.50 | | | | | | | | | | |--- weights: [27.59, 97.16] class: 1 | | | | | | | | | |--- arrival_month > 3.50 | | | | | | | | | | |--- lead_time <= 102.00 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | | | | | |--- lead_time > 102.00 | | | | | | | | | | | |--- weights: [12.67, 3.04] class: 0 | | | | | | | | |--- avg_price_per_room > 71.34 | | | | | | | | | |--- weights: [11.18, 0.00] class: 0 | | | | | |--- avg_price_per_room > 71.92 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- lead_time <= 65.50 | | | | | | | | |--- avg_price_per_room <= 120.45 | | | | | | | | | |--- weights: [79.77, 9.11] class: 0 | | | | | | | | |--- avg_price_per_room > 120.45 | | | | | | | | | |--- weights: [7.46, 12.14] class: 1 | | | | | | | |--- lead_time > 65.50 | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 <= 0.50 | | | | | | | | | |--- weights: [20.13, 47.06] class: 1 | | | | | | | | |--- type_of_meal_plan_Meal Plan 2 > 0.50 | | | | | | | | | |--- weights: [0.00, 63.76] class: 1 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- avg_price_per_room <= 104.31 | | | | | | | | |--- lead_time <= 25.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [16.40, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- weights: [38.77, 118.41] class: 1 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- weights: [23.11, 0.00] class: 0 | | | | | | | | |--- lead_time > 25.50 | | | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | | | |--- arrival_month <= 
5.50 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | | | |--- weights: [61.88, 242.90] class: 1 | | | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | | | |--- weights: [73.81, 411.41] class: 1 | | | | | | | |--- avg_price_per_room > 104.31 | | | | | | | | |--- arrival_month <= 10.50 | | | | | | | | | |--- room_type_reserved_Room_Type 5 <= 0.50 | | | | | | | | | | |--- avg_price_per_room <= 195.30 | | | | | | | | | | | |--- truncated branch of depth 4 | | | | | | | | | | |--- avg_price_per_room > 195.30 | | | | | | | | | | | |--- weights: [0.75, 138.15] class: 1 | | | | | | | | | |--- room_type_reserved_Room_Type 5 > 0.50 | | | | | | | | | | |--- arrival_date <= 22.50 | | | | | | | | | | | |--- weights: [11.18, 6.07] class: 0 | | | | | | | | | | |--- arrival_date > 22.50 | | | | | | | | | | | |--- weights: [0.75, 9.11] class: 1 | | | | | | | | |--- arrival_month > 10.50 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- avg_price_per_room <= 162.82 | | | | | | | | | | | |--- weights: [21.62, 34.92] class: 1 | | | | | | | | | | |--- avg_price_per_room > 162.82 | | | | | | | | | | | |--- weights: [9.69, 1.52] class: 0 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- weights: [8.95, 60.72] class: 1 | | | | |--- required_car_parking_space > 0.50 | | | | | |--- weights: [48.46, 1.52] class: 0 | |--- no_of_special_requests > 0.50 | | |--- no_of_special_requests <= 1.50 | | | |--- market_segment_type_Online <= 0.50 | | | | |--- lead_time <= 102.50 | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | |--- weights: [697.09, 9.11] class: 0 | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | |--- lead_time <= 63.00 | | | | | | | |--- weights: [15.66, 1.52] class: 0 | | | | | | |--- lead_time > 63.00 | | | | | | | |--- weights: [0.00, 7.59] class: 1 | | | | |--- lead_time > 102.50 | | | | | |--- weights: [76.79, 22.77] 
class: 0 | | | |--- market_segment_type_Online > 0.50 | | | | |--- lead_time <= 8.50 | | | | | |--- lead_time <= 4.50 | | | | | | |--- no_of_days_stayed <= 14.00 | | | | | | | |--- weights: [498.03, 40.99] class: 0 | | | | | | |--- no_of_days_stayed > 14.00 | | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | | | |--- lead_time > 4.50 | | | | | | |--- arrival_date <= 13.50 | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | |--- weights: [58.90, 36.43] class: 0 | | | | | | | |--- arrival_month > 9.50 | | | | | | | | |--- weights: [33.55, 1.52] class: 0 | | | | | | |--- arrival_date > 13.50 | | | | | | | |--- type_of_meal_plan_Not Selected <= 0.50 | | | | | | | | |--- weights: [123.76, 9.11] class: 0 | | | | | | | |--- type_of_meal_plan_Not Selected > 0.50 | | | | | | | | |--- avg_price_per_room <= 126.33 | | | | | | | | | |--- weights: [32.80, 3.04] class: 0 | | | | | | | | |--- avg_price_per_room > 126.33 | | | | | | | | | |--- weights: [9.69, 13.66] class: 1 | | | | |--- lead_time > 8.50 | | | | | |--- required_car_parking_space <= 0.50 | | | | | | |--- avg_price_per_room <= 118.55 | | | | | | | |--- lead_time <= 61.50 | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | |--- no_of_days_stayed <= 6.50 | | | | | | | | | | |--- arrival_month <= 1.50 | | | | | | | | | | | |--- weights: [65.61, 0.00] class: 0 | | | | | | | | | | |--- arrival_month > 1.50 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- no_of_days_stayed > 6.50 | | | | | | | | | | |--- weights: [24.60, 39.47] class: 1 | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | |--- weights: [126.74, 1.52] class: 0 | | | | | | | |--- lead_time > 61.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 7.50 | | | | | | | | | | |--- weights: [4.47, 57.69] class: 1 | | | | | | | | | |--- arrival_month > 7.50 | | | | | | | | | | |--- weights: [42.50, 54.65] class: 1 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | 
| |--- arrival_month <= 9.50 | | | | | | | | | | |--- avg_price_per_room <= 71.93 | | | | | | | | | | | |--- weights: [54.43, 3.04] class: 0 | | | | | | | | | | |--- avg_price_per_room > 71.93 | | | | | | | | | | | |--- truncated branch of depth 10 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- truncated branch of depth 3 | | | | | | |--- avg_price_per_room > 118.55 | | | | | | | |--- arrival_month <= 8.50 | | | | | | | | |--- arrival_date <= 19.50 | | | | | | | | | |--- no_of_people <= 2.50 | | | | | | | | | | |--- lead_time <= 146.50 | | | | | | | | | | | |--- weights: [210.25, 62.24] class: 0 | | | | | | | | | | |--- lead_time > 146.50 | | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | | |--- no_of_people > 2.50 | | | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | | | |--- weights: [48.46, 18.22] class: 0 | | | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | | | |--- weights: [44.73, 53.13] class: 1 | | | | | | | | |--- arrival_date > 19.50 | | | | | | | | | |--- arrival_date <= 27.50 | | | | | | | | | | |--- avg_price_per_room <= 121.20 | | | | | | | | | | | |--- weights: [18.64, 6.07] class: 0 | | | | | | | | | | |--- avg_price_per_room > 121.20 | | | | | | | | | | | |--- weights: [78.28, 110.82] class: 1 | | | | | | | | | |--- arrival_date > 27.50 | | | | | | | | | | |--- weights: [67.10, 39.47] class: 0 | | | | | | | |--- arrival_month > 8.50 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- arrival_month <= 9.50 | | | | | | | | | | |--- weights: [11.93, 10.63] class: 0 | | | | | | | | | |--- arrival_month > 9.50 | | | | | | | | | | |--- weights: [37.28, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- arrival_month <= 11.50 | | | | | | | | | | |--- avg_price_per_room <= 119.20 | 
| | | | | | | | | | |--- weights: [9.69, 28.84] class: 1 | | | | | | | | | | |--- avg_price_per_room > 119.20 | | | | | | | | | | | |--- truncated branch of depth 5 | | | | | | | | | |--- arrival_month > 11.50 | | | | | | | | | | |--- lead_time <= 100.00 | | | | | | | | | | | |--- weights: [49.95, 0.00] class: 0 | | | | | | | | | | |--- lead_time > 100.00 | | | | | | | | | | | |--- weights: [0.75, 18.22] class: 1 | | | | | |--- required_car_parking_space > 0.50 | | | | | | |--- weights: [134.20, 1.52] class: 0 | | |--- no_of_special_requests > 1.50 | | | |--- lead_time <= 90.50 | | | | |--- no_of_days_stayed <= 4.50 | | | | | |--- weights: [1582.81, 13.66] class: 0 | | | | |--- no_of_days_stayed > 4.50 | | | | | |--- no_of_days_stayed <= 12.00 | | | | | | |--- weights: [234.85, 40.99] class: 0 | | | | | |--- no_of_days_stayed > 12.00 | | | | | | |--- weights: [0.00, 3.04] class: 1 | | | |--- lead_time > 90.50 | | | | |--- no_of_special_requests <= 2.50 | | | | | |--- arrival_month <= 8.50 | | | | | | |--- avg_price_per_room <= 202.95 | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | |--- weights: [9.69, 12.14] class: 1 | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | |--- lead_time <= 150.50 | | | | | | | | | |--- no_of_days_stayed <= 5.50 | | | | | | | | | | |--- weights: [154.33, 16.70] class: 0 | | | | | | | | | |--- no_of_days_stayed > 5.50 | | | | | | | | | | |--- weights: [20.88, 12.14] class: 0 | | | | | | | | |--- lead_time > 150.50 | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | |--- avg_price_per_room > 202.95 | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | |--- arrival_month > 8.50 | | | | | | |--- weights: [106.61, 106.27] class: 0 | | | | |--- no_of_special_requests > 2.50 | | | | | |--- weights: [67.10, 0.00] class: 0 |--- lead_time > 151.50 | |--- avg_price_per_room <= 100.04 | | |--- no_of_special_requests <= 0.50 | | | |--- no_of_people <= 1.50 | | | | |--- market_segment_type_Online <= 0.50 | | | | 
| |--- lead_time <= 163.50 | | | | | | |--- arrival_month <= 5.00 | | | | | | | |--- weights: [2.98, 0.00] class: 0 | | | | | | |--- arrival_month > 5.00 | | | | | | | |--- weights: [0.75, 24.29] class: 1 | | | | | |--- lead_time > 163.50 | | | | | | |--- lead_time <= 341.00 | | | | | | | |--- lead_time <= 173.00 | | | | | | | | |--- arrival_date <= 3.50 | | | | | | | | | |--- weights: [46.97, 9.11] class: 0 | | | | | | | | |--- arrival_date > 3.50 | | | | | | | | | |--- weights: [2.24, 13.66] class: 1 | | | | | | | |--- lead_time > 173.00 | | | | | | | | |--- arrival_month <= 5.50 | | | | | | | | | |--- arrival_date <= 7.50 | | | | | | | | | | |--- weights: [0.00, 4.55] class: 1 | | | | | | | | | |--- arrival_date > 7.50 | | | | | | | | | | |--- weights: [6.71, 0.00] class: 0 | | | | | | | | |--- arrival_month > 5.50 | | | | | | | | | |--- weights: [187.88, 7.59] class: 0 | | | | | | |--- lead_time > 341.00 | | | | | | | |--- weights: [13.42, 27.33] class: 1 | | | | |--- market_segment_type_Online > 0.50 | | | | | |--- avg_price_per_room <= 2.50 | | | | | | |--- weights: [8.95, 3.04] class: 0 | | | | | |--- avg_price_per_room > 2.50 | | | | | | |--- weights: [0.75, 80.46] class: 1 | | | |--- no_of_people > 1.50 | | | | |--- avg_price_per_room <= 82.47 | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | |--- weights: [2.98, 288.44] class: 1 | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- lead_time <= 244.00 | | | | | | | | |--- no_of_days_stayed <= 2.50 | | | | | | | | | |--- lead_time <= 166.50 | | | | | | | | | | |--- weights: [2.24, 0.00] class: 0 | | | | | | | | | |--- lead_time > 166.50 | | | | | | | | | | |--- weights: [2.24, 57.69] class: 1 | | | | | | | | |--- no_of_days_stayed > 2.50 | | | | | | | | | |--- arrival_date <= 11.50 | | | | | | | | | | |--- weights: [53.68, 4.55] class: 0 | | | | | | | | | |--- arrival_date > 11.50 | | | | | | | | | | |--- arrival_date <= 15.50 | | | | | | 
| | | | | |--- truncated branch of depth 2 | | | | | | | | | | |--- arrival_date > 15.50 | | | | | | | | | | | |--- weights: [44.73, 10.63] class: 0 | | | | | | | |--- lead_time > 244.00 | | | | | | | | |--- arrival_year <= 2017.50 | | | | | | | | | |--- weights: [25.35, 0.00] class: 0 | | | | | | | | |--- arrival_year > 2017.50 | | | | | | | | | |--- avg_price_per_room <= 80.38 | | | | | | | | | | |--- avg_price_per_room <= 76.00 | | | | | | | | | | | |--- weights: [10.44, 245.93] class: 1 | | | | | | | | | | |--- avg_price_per_room > 76.00 | | | | | | | | | | | |--- truncated branch of depth 2 | | | | | | | | | |--- avg_price_per_room > 80.38 | | | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- weights: [46.97, 0.00] class: 0 | | | | |--- avg_price_per_room > 82.47 | | | | | |--- lead_time <= 324.50 | | | | | | |--- arrival_month <= 11.50 | | | | | | | |--- room_type_reserved_Room_Type 4 <= 0.50 | | | | | | | | |--- weights: [10.44, 997.40] class: 1 | | | | | | | |--- room_type_reserved_Room_Type 4 > 0.50 | | | | | | | | |--- market_segment_type_Offline <= 0.50 | | | | | | | | | |--- weights: [0.00, 10.63] class: 1 | | | | | | | | |--- market_segment_type_Offline > 0.50 | | | | | | | | | |--- weights: [4.47, 0.00] class: 0 | | | | | | |--- arrival_month > 11.50 | | | | | | | |--- market_segment_type_Online <= 0.50 | | | | | | | | |--- weights: [7.46, 0.00] class: 0 | | | | | | | |--- market_segment_type_Online > 0.50 | | | | | | | | |--- weights: [0.00, 19.74] class: 1 | | | | | |--- lead_time > 324.50 | | | | | | |--- avg_price_per_room <= 89.00 | | | | | | | |--- weights: [5.96, 0.00] class: 0 | | | | | | |--- avg_price_per_room > 89.00 | | | | | | | |--- weights: [0.75, 13.66] class: 1 | | |--- no_of_special_requests > 0.50 | | | |--- market_segment_type_Offline <= 0.50 | | | | |--- lead_time <= 180.50 | | | | | |--- lead_time <= 159.50 | | | | | | |--- arrival_month <= 8.50 | | | | | | | |--- 
weights: [20.88, 9.11] class: 0 | | | | | | |--- arrival_month > 8.50 | | | | | | | |--- weights: [3.73, 13.66] class: 1 | | | | | |--- lead_time > 159.50 | | | | | | |--- weights: [94.69, 21.25] class: 0 | | | | |--- lead_time > 180.50 | | | | | |--- no_of_days_stayed <= 3.50 | | | | | | |--- no_of_special_requests <= 2.50 | | | | | | | |--- weights: [46.22, 186.73] class: 1 | | | | | | |--- no_of_special_requests > 2.50 | | | | | | | |--- weights: [8.20, 0.00] class: 0 | | | | | |--- no_of_days_stayed > 3.50 | | | | | | |--- arrival_year <= 2017.50 | | | | | | | |--- weights: [14.17, 0.00] class: 0 | | | | | | |--- arrival_year > 2017.50 | | | | | | | |--- weights: [115.56, 133.59] class: 1 | | | |--- market_segment_type_Offline > 0.50 | | | | |--- lead_time <= 348.50 | | | | | |--- weights: [127.49, 7.59] class: 0 | | | | |--- lead_time > 348.50 | | | | | |--- weights: [5.96, 6.07] class: 1 | |--- avg_price_per_room > 100.04 | | |--- arrival_month <= 11.50 | | | |--- no_of_special_requests <= 2.50 | | | | |--- weights: [0.00, 3200.19] class: 1 | | | |--- no_of_special_requests > 2.50 | | | | |--- weights: [23.11, 0.00] class: 0 | | |--- arrival_month > 11.50 | | | |--- no_of_special_requests <= 0.50 | | | | |--- weights: [35.04, 0.00] class: 0 | | | |--- no_of_special_requests > 0.50 | | | | |--- arrival_date <= 24.50 | | | | | |--- weights: [3.73, 0.00] class: 0 | | | | |--- arrival_date > 24.50 | | | | | |--- weights: [3.73, 22.77] class: 1
# Visualizing feature importance
importances = best_model.feature_importances_
indices = np.argsort(importances)
plt.figure(figsize=(12, 12))
plt.title("Feature Importance (Post Pruning)")
plt.barh(range(len(indices)), importances[indices], color="violet", align="center")
plt.yticks(range(len(indices)), [feature_names[i] for i in indices])
plt.xlabel("Relative Importance")
plt.show()
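When many features have importances close to zero, a ranked table can be easier to read than the bar chart. The following is a minimal sketch of one way to build it; the helper name `rank_importances` is hypothetical (not part of the notebook), and the demo values are made up rather than taken from `best_model`.

```python
import pandas as pd


def rank_importances(importances, feature_names, top_n=10):
    """Return the top_n features sorted by importance, largest first."""
    ranked = pd.Series(importances, index=feature_names, name="importance")
    return ranked.sort_values(ascending=False).head(top_n)


# Illustrative values only (not taken from the fitted model)
demo = rank_importances([0.1, 0.6, 0.3], ["a", "b", "c"], top_n=2)
print(demo)
```

With the notebook's own objects, the call would presumably be `rank_importances(best_model.feature_importances_, feature_names)`.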
# Training performance comparison
models_train_comp_df = pd.concat(
[
decision_tree_perf_train.T,
decision_tree_tune_perf_train.T,
decision_tree_postpruned_perf_train.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Decision Tree sklearn",
"Decision Tree (Pre-Pruning)",
"Decision Tree (Post-Pruning)",
]
print("Training performance comparison:")
models_train_comp_df
Training performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.993108 | 0.882443 | 0.875158 |
| Recall | 0.995097 | 0.918331 | 0.895253 |
| Precision | 0.984153 | 0.769385 | 0.765464 |
| F1 | 0.989595 | 0.837285 | 0.825287 |
# Test performance comparison
# (stored under its own name so the training comparison is not overwritten)
models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_postpruned_perf_test.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Test set performance comparison:")
models_test_comp_df
Test set performance comparison:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.863089 | 0.846733 | 0.852706 |
| Recall | 0.810619 | 0.871664 | 0.866837 |
| Precision | 0.776237 | 0.716286 | 0.729162 |
| F1 | 0.793056 | 0.786373 | 0.792061 |
# Training performance comparison for logistic regression models
models_train_comp_df = pd.concat(
[
log_reg_model_train_perf.T,
log_reg_model_train_perf_threshold_auc_roc.T,
log_reg_model_train_perf_threshold_curve.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Logistic Regression statsmodel",
"Logistic Regression-0.35 Threshold",
"Logistic Regression-0.39 Threshold",
]
print("Training performance comparison for logistic regression models:")
models_train_comp_df
Training performance comparison for logistic regression models:
| | Logistic Regression statsmodel | Logistic Regression-0.35 Threshold | Logistic Regression-0.39 Threshold |
|---|---|---|---|
| Accuracy | 0.804624 | 0.787138 | 0.795723 |
| Recall | 0.628124 | 0.752840 | 0.721392 |
| Precision | 0.739443 | 0.653519 | 0.678628 |
| F1 | 0.679253 | 0.699672 | 0.699357 |
# Testing performance comparison for logistic regression models
models_test_comp_df = pd.concat(
[
log_reg_model_test_perf.T,
log_reg_model_test_perf_threshold_auc_roc.T,
log_reg_model_test_perf_threshold_curve.T,
],
axis=1,
)
models_test_comp_df.columns = [
"Logistic Regression statsmodel",
"Logistic Regression-0.35 Threshold",
"Logistic Regression-0.39 Threshold",
]
print("Test set performance comparison for logistic regression models:")
models_test_comp_df
Test set performance comparison for logistic regression models:
| | Logistic Regression statsmodel | Logistic Regression-0.35 Threshold | Logistic Regression-0.39 Threshold |
|---|---|---|---|
| Accuracy | 0.803179 | 0.792520 | 0.800331 |
| Recall | 0.626917 | 0.760080 | 0.727712 |
| Precision | 0.727273 | 0.654523 | 0.678581 |
| F1 | 0.673376 | 0.703363 | 0.702288 |
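The 0.35 and 0.39 columns differ from the default model only in the cutoff used to turn predicted probabilities into class labels. The sketch below illustrates that mechanism on made-up labels and probabilities (not the hotel data); the helper `precision_recall_at` is hypothetical, and the strict `>` comparison is an assumption about how the cutoff is applied.

```python
import numpy as np


def precision_recall_at(y_true, y_prob, threshold):
    """Precision and recall for class 1 when labels are assigned at `threshold`."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data for illustration only
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.37, 0.45, 0.2, 0.1]

for t in (0.5, 0.35):
    p, r = precision_recall_at(y_true, y_prob, t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

In this toy case, lowering the cutoff from 0.5 to 0.35 raises recall and lowers precision, which is the same direction of change the tables show for the thresholded logistic regression models.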
# Training performance comparison for decision trees
models_train_comp_df = pd.concat(
[
decision_tree_perf_train.T,
decision_tree_tune_perf_train.T,
decision_tree_postpruned_perf_train.T,
],
axis=1,
)
models_train_comp_df.columns = [
"Decision Tree sklearn",
"Decision Tree (Pre-Pruning)",
"Decision Tree (Post-Pruning)",
]
print("Training performance comparison for decision trees:")
models_train_comp_df
Training performance comparison for decision trees:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.993108 | 0.882443 | 0.875158 |
| Recall | 0.995097 | 0.918331 | 0.895253 |
| Precision | 0.984153 | 0.769385 | 0.765464 |
| F1 | 0.989595 | 0.837285 | 0.825287 |
# Test performance comparison for decision trees
# (stored under its own name so the training comparison is not overwritten)
models_test_comp_df = pd.concat(
    [
        decision_tree_perf_test.T,
        decision_tree_tune_perf_test.T,
        decision_tree_postpruned_perf_test.T,
    ],
    axis=1,
)
models_test_comp_df.columns = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]
print("Test set performance comparison for decision trees:")
models_test_comp_df
Test set performance comparison for decision trees:
| | Decision Tree sklearn | Decision Tree (Pre-Pruning) | Decision Tree (Post-Pruning) |
|---|---|---|---|
| Accuracy | 0.863089 | 0.846733 | 0.852706 |
| Recall | 0.810619 | 0.871664 | 0.866837 |
| Precision | 0.776237 | 0.716286 | 0.729162 |
| F1 | 0.793056 | 0.786373 | 0.792061 |
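One way to read the training and test tables together is the train-minus-test gap, a rough indicator of overfitting. The sketch below recomputes that gap from the Recall and F1 values printed in the decision tree tables; it retypes the numbers by hand rather than using the notebook's performance DataFrames, so it is an illustration and not part of the original workflow.

```python
import pandas as pd

models = [
    "Decision Tree sklearn",
    "Decision Tree (Pre-Pruning)",
    "Decision Tree (Post-Pruning)",
]

# Values copied from the decision tree comparison tables
train = pd.DataFrame(
    {"Recall": [0.995097, 0.918331, 0.895253], "F1": [0.989595, 0.837285, 0.825287]},
    index=models,
)
test = pd.DataFrame(
    {"Recall": [0.810619, 0.871664, 0.866837], "F1": [0.793056, 0.786373, 0.792061]},
    index=models,
)

# Positive gap = the model scores better on training data than on test data
gap = (train - test).round(3)
print(gap)
```

The unpruned tree shows the largest gap on both metrics and the post-pruned tree the smallest, while their test F1 scores are nearly the same.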
If not already in place, institute a no-cancellation policy within 24 hours of a booking. As seen in the data, some canceled bookings had a lead time of 0 days. Such a policy would let INN Hotels collect a fee from guests who violate it while still being able to sell the room for the night.
Late spring and summer have higher cancellation rates than the fall and winter months. Since spring and summer are among the busiest vacation times, a no-refund policy for cancellations could be instituted for those months.
In lieu of offering refunds, INN Hotels could institute a policy under which any cancellation made within a certain lead time becomes a credit that can be applied to a future stay. This could be limited to the busy season.
Play up the willingness to accommodate special requests. As seen in the booking data, the more special requests a guest made, the less likely that guest was to cancel the booking.
Play up the availability of parking. Per the data, guests who needed a parking space canceled much less often than guests who did not. INN Hotels might be one of only a few hotels that offer parking.
Find ways to incentivize new guests to become repeat guests. As seen in the data, repeat guests had a much lower cancellation rate than new guests.
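The recommendations above rest on comparing cancellation rates across guest groups. A minimal sketch of that comparison follows. It uses a toy DataFrame with made-up rows, and it assumes a target column named `booking_status` encoded as 1 for canceled and 0 otherwise; both the column name and the encoding are assumptions, and the helper `cancellation_rate_by` is hypothetical.

```python
import pandas as pd

# Toy rows for illustration only; not taken from the INN Hotels data
toy = pd.DataFrame(
    {
        "repeated_guest": [0, 0, 0, 0, 1, 1],
        "required_car_parking_space": [0, 0, 1, 0, 1, 0],
        "booking_status": [1, 1, 0, 0, 0, 0],  # assumed encoding: 1 = canceled
    }
)


def cancellation_rate_by(df, column, target="booking_status"):
    """Share of canceled bookings within each level of `column`."""
    return df.groupby(column)[target].mean()


print(cancellation_rate_by(toy, "repeated_guest"))
print(cancellation_rate_by(toy, "required_car_parking_space"))
```

The same call could be pointed at `arrival_month` or `no_of_special_requests` to check the seasonal and special-request patterns on the real data.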